Abstract

Pattern matching is a mechanism to write programs by case distinction and 
recursion in functional programming languages. In a language based on type 
theory with dependent types, pattern matching allows us to write not just 
programs, but also proofs. In order to check whether these proofs are correct, 
certain restrictions are put on definitions by pattern matching [Coq92]. These
restrictions allow us to express definitions by pattern matching in terms of
the theoretically simpler eliminators [GMM06]. This ensures that definitions
by pattern matching are correct, but also limits the expressiveness of the 
language. 

One of these restrictions is that patterns must form a covering, i.e. they must
be created by repeatedly splitting on a pattern variable. Some dependently typed
languages with pattern matching, like Agda [Nor07], allow more general pattern
sets, but translate them to a covering internally. In this translation,
overlapping patterns are treated on a first-match basis. However, the result of
the translation depends on the order of the clauses, even when the patterns do
not overlap. This can lead to unexpected results for a user who doesn't know
the internal workings of this translation. This is a sign of bad abstraction.

This thesis is concerned with making dependent pattern matching more
intuitive for the user. We do this by interpreting each clause directly as a
definitional equality, even when the patterns overlap. In particular, our
interpretation doesn't depend on the order of the patterns. This allows us to give
definitions with overlapping patterns, which can be used to extend a function
with extra evaluation rules. To interpret pattern matching in this way, we lift
the restriction that the patterns must form a covering. Instead of this
restriction, we give a more general criterion for completeness. In order to ensure
correctness in the presence of overlapping patterns, we also give a criterion for
confluence.

By making all clauses hold as definitional equalities, definitions by pattern
matching feel more like mathematical definitions, rather than program
instructions. However, we lose the ability to translate pattern matching to
the use of eliminators, making it more complex to understand theoretically.
In order to reduce this loss, we will prove a theoretical result that gives an
equivalence with non-overlapping definitions.



Summary for non-specialists

Software plays an increasingly central role in our society. We find it on
our mobile phones, in our televisions and in our cars, but also in telephone exchanges,
medical equipment, spacecraft and military equipment. For such applications it is vitally
important that the software works flawlessly. That is why this software is tested thoroughly
beforehand. And yet practice teaches us time and again that some errors always remain in software.

Is it then impossible to make a software program that always does what we expect
of it? The only way to be one hundred percent certain is to prove mathematically
that the program works correctly. But giving mathematical proofs is hard,
and therefore reserved for specialists. Moreover, even the proofs of
specialists sometimes contain errors, which can make the whole proof worthless. Still,
this is already a step in the right direction.

To make it easier to give mathematical proofs about programs,
new programming languages have been designed. In these programming languages we can write
not only programs, but also mathematical proofs. The computer then checks these
proofs automatically. If they contain errors, the computer refuses to run the
program. Thanks to this strict check, we can finally be certain that the
program works correctly.

Unfortunately, these programming languages are currently not usable without
thorough knowledge of the underlying mathematical theory. It is, however, not realistic to
expect this knowledge from someone who 'just' wants to write a correct program.
So an important goal is to design these languages in such a way that the required knowledge
is kept to a minimum.

An example of a widely used technique for writing programs is case
distinction. This is a way of telling the computer: in this case you should do this,
in that case that, and so on. The technical name for this is pattern matching. If
we want to prove something about a program that uses case distinction, we have to
prove it for each case separately. A proof about a program with case
distinction thus makes use of the same case distinction!

In current programming languages a problem arises when several cases
of the case distinction apply at the same time. In this situation, the
first applicable case is chosen. This makes it harder
to give the proof for the other cases that were also applicable.

The goal of my thesis is to make case distinction easier to use.
I do this by choosing not the first applicable case, but all
applicable cases at once. Of course it is neither realistic nor desirable to actually
execute all cases simultaneously. Instead I let the computer check whether
the programs for the different cases are confluent. This means that
all cases give the same result. In practice only one case is still
chosen, but we now know that the other cases would give the same result.

Programming languages in which we can also give proofs are still far from the point
where they are usable by everyone. Hopefully this thesis brings that point a
little bit closer again. And maybe one day it will be just as easy
to write a program as to prove that it is correct.
Chapter 1
Introduction 



Pattern matching is a useful mechanism to define functions in functional programming 
languages. Definitions by pattern matching are given by a set of equalities called clauses. 
An example of a definition by pattern matching is the addition plus on natural numbers 

plus zero n = n

plus (suc m) n = suc (plus m n)

Here zero represents the number 0 and suc m represents the number m + 1. This example already
shows two powerful features of pattern matching: case distinction and recursion. 

Languages based on type theory such as Agda or Coq are similar to functional lan- 
guages, only they have more expressive types. These types are called dependent types 
because they can depend on values. Dependent types allow us to formulate arbitrary 
logical propositions as types. To prove a proposition formulated as a type, we just have 
to give a program of that type. For example, here is a program called lemma, with one
argument m, which proves that plus m zero = m (remark that this is not immediately
obvious from the definition of plus):

lemma zero = refl zero

lemma (suc m) = cong suc (lemma m)

Here refl m is a proof that m = m and cong f p is a proof that f x = f y if p is a
proof that x = y. Remark that this program uses pattern matching to give a proof by
case distinction and induction. So in a language with dependent types, pattern matching 
becomes even more powerful because it allows us to give not just programs, but also 
proofs. 
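
As an illustration, the two definitions above can be written out in full as a small Agda file. This is only a sketch under our own assumptions: the declarations of Nat, of the identity type _≡_ and of cong are spelled out by hand here instead of being imported from a library, and refl takes its argument implicitly, whereas the text writes refl m.

```agda
-- Unary natural numbers.
data Nat : Set where
  zero : Nat
  suc  : Nat → Nat

-- Addition by case distinction and recursion on the first argument.
plus : Nat → Nat → Nat
plus zero    n = n
plus (suc m) n = suc (plus m n)

infix 4 _≡_

-- Identity type with its single constructor refl.
data _≡_ {A : Set} (x : A) : A → Set where
  refl : x ≡ x

-- Equal arguments give equal results.
cong : {A B : Set} (f : A → B) {x y : A} → x ≡ y → f x ≡ f y
cong f refl = refl

-- Proof by case distinction and induction that plus m zero equals m.
lemma : (m : Nat) → plus m zero ≡ m
lemma zero    = refl
lemma (suc m) = cong suc (lemma m)
```

If the type checker accepts this file, the lemma is proven; there is nothing to run. Replacing the second clause of lemma by refl should be rejected, because plus (suc m) zero only evaluates to suc (plus m zero) and not further.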

With great power comes great responsibility, as the saying goes. If we are not careful, 
it is very possible to give incorrect proofs by pattern matching. For example, a case 
analysis might be incomplete or a recursive proof might become infinitely large when we 
expand it. This leads to an inconsistent logic. Hence dependently typed languages will 
put certain restrictions on definitions by pattern matching: 

• To ensure completeness, it is required that the patterns of the definition form a covering.

• To ensure termination, it is required that the definition is structurally recursive. 

If these restrictions are satisfied, it is possible to translate the definition to one that
doesn't use pattern matching, but only the (from a theoretical perspective) simpler
eliminators [GMM06]. This translation ensures that definitions satisfying these restrictions
are correct. However, these restrictions also limit the expressiveness for the programmer. 

1.1 Goal of this thesis 

In the dependently typed language Agda, more general definitions by pattern matching
are allowed. Internally these definitions are translated to a standard form that
satisfies the above requirements. Hence this translation allows high expressiveness while
keeping correctness checking feasible. However, some details of the definition are lost in
the translation:

• Overlapping patterns are interpreted on a first-match basis, so some of the informa- 
tion in the later patterns can be lost. 

• Not all clauses of the definition are preserved as definitional equalities in the trans- 
lation, even if the patterns do not overlap. 

• The order of the clauses in the definition influences the result of the translation, 
even if the patterns do not overlap. 

This results in discrepancies between the definition given by the programmer and the 
definition used internally by Agda, and confusion ensues. 

In this thesis, we will describe a new interpretation of pattern matching that treats all 
clauses as definitional equalities, even when the patterns overlap. For example, this will 
allow us to define plus as follows: 

plus zero n = n

plus (suc m) n = suc (plus m n)

plus m zero = m

plus m (suc n) = suc (plus m n)

By defining plus like this, some facts such as plus m zero = m don't need to be proven 
anymore, and some others become easier to prove. By interpreting clauses directly as 
definitional equalities, we eliminate the translation step. This has the following conse- 
quences: 

• Overlapping patterns do not follow the first-match semantics any more. 

• Each clause in the definition is a definitional equality, i.e. it has a clear interpretation 
that doesn't depend on the other clauses. 

• In particular, the order in which the clauses are given doesn't influence their inter- 
pretation. 

This generalized form of pattern matching has been implemented as a modification to 
Agda. 

In order to use definitions with overlapping patterns, we need to be able to check their 
correctness automatically. In particular, we need to check whether a defined function will 
always give the same result for the same arguments. This property is called confluence.
We will describe an algorithm that automatically checks whether a given definition is 
confluent. This algorithm has also been implemented as part of the modification to Agda. 

1.2 Related work 

It would not have been possible to write this thesis without standing on the shoulders of
giants. My two main sources of inspiration were the paper on eliminating dependent
pattern matching by H. Goguen, C. McBride, and J. McKinna [GMM06] and U. Norell's
PhD thesis on Agda [Nor07]. A special mention goes to a paper by A. Graf [Gr91] for
inspiring many of the ideas about overlapping patterns in this thesis, even if this is not always
visible in the final version.

When I started this thesis, I hardly knew anything about type theory. Of course there
are many great introductions to the subject; for me the books by S. Thompson [Tho99]
and B. Nordström et al. [NPS90] were especially helpful. The description of type theory
that is used in this thesis is the Unified Theory of Dependent Types (UTT) of Z. Luo
[Luo94], which is also studied in detail by H. Goguen in [Gog94].

On the subject of dependent pattern matching, the paper by T. Coquand [Coq92] that
started it all should certainly be mentioned. Other useful references are [McB00] by C.
McBride and [MM04] by C. McBride and J. McKinna.

1.3 Overview 

Chapter 2 gives a short introduction to type theory with dependent types, in particular
the theory UTT (Unified Theory of Dependent Types) [Luo94]. We will pay special
attention to inductive families of dependent types, because they play an important role
in dependent pattern matching. We will also discuss the fundamental properties of type
theory: subject reduction, strong normalization, Church-Rosser, and completeness of
β-evaluation. These are the properties that need to be preserved whenever we make any
extension to type theory, such as ours.

Chapter 3 gives a general description of pattern matching, first in the case of simple 
types and then extending it to the dependent case. We will see that pattern matching 
on inductive families requires a new kind of patterns called inaccessible patterns. We will 
also give the three conditions that a definition by pattern matching has to satisfy in order 
not to break the fundamental properties: completeness, termination, and confluence. 

Chapter 4 describes the algorithms that are used in dependently typed languages to 
automatically check these conditions. In particular, we will define coverings and the struc- 
tural order which can be used to check completeness and termination respectively. We 
end the chapter with a description of case trees, which can be used to efficiently represent 
definitions by pattern matching. These case trees are exactly the internal representations 
of definitions by pattern matching used by Agda. 

Chapter 5 gives some deficiencies of the algorithms described in chapter 4. Because
they require patterns to form a covering, they do not allow overlapping patterns, and the 
result depends on the order of the patterns. To fix these problems, we will give a new 
interpretation of pattern matching that doesn't require patterns to form a covering. In 
particular, we allow the patterns in a definition to overlap. 

Chapter 6 describes how it can be checked whether definitions with overlapping pat- 
terns are correct. We will describe algorithms for completeness and confluence checking. 
We will also describe two equivalent ways in which definitions with overlapping clauses 
can be represented as case trees. 

Chapter 7 gives some examples of definitions with overlapping patterns. We will also 
compare our solution to an alternative one that requires explicit proofs of confluence. 

Chapter 8 finally describes a theoretical result that gives an extensional equivalence 
between definitions with overlapping patterns and definitions without. 

Appendix A shows how these modifications take on a concrete form in Agda. It also 
gives a few details about the implementation which are specific to Agda. 



Chapter 2 
Type theory 



Type theory is a formal language that is used by both computer scientists and mathe- 
maticians. For computer scientists, type theory is useful because it brings programming 
and logic together in one language. This allows properties of programs to be proven in 
the language of the program itself. For mathematicians, type theory is an alternative to 
classical logic and set theory. Each proof in type theory is a constructive proof: proofs are 
programs that can carry out the concerned constructions. Because type theory is used in 
both worlds, new ideas in one can be used directly in the other. This two-way interaction 
is precisely what makes type theory so interesting. 

The first two sections of this chapter give an informal introduction to type theory, first 
in general and then to dependent types specifically. Readers who are familiar with type 
theory can skip to section 2.3, which gives a more formal description. The next section 
then introduces inductive families, without which pattern matching would not be possible. 
Finally, the last section describes the most important meta-theoretical properties of type 
theory. 

2.1 What is type theory? 

Formally, type theory is described by a set of rules that can be used to deduce the true 
judgments of the system. As a formal system, type theory doesn't give any interpretation 
to these judgments, but it can be interpreted in several ways. A logician can use type 
theory as a formal logic, a computer scientist can use type theory as a functional program- 
ming language, and a mathematician can use type theory as a framework for constructive 
mathematics. Central to type theory are judgments of the form

p : P 

Depending on our viewpoint coming from logic, computer science or mathematics, this 
can be read as 

"p is a proof of proposition P"

or "p is a term of type P"

or "p is an element of the set P".

Under these interpretations a proposition corresponds to a type, and a proof of this
proposition corresponds to a term of this type. A false proposition in particular
corresponds to an empty type, i.e. a type without terms. This correspondence between
logic and functional programming is often called the Curry-Howard correspondence. We
will use the words 'type' and 'proposition', as well as 'term' and 'proof', interchangeably,
depending on the interpretation we wish to stress.

Each interpretation of type theory will give motivation to add new types to the theory.
For example, we will introduce types that correspond to the quantifiers ∀ and ∃ from
predicate logic. Using the Curry-Howard correspondence, we will also be able to write
programs using these new types. These new types are called dependent types and will
give a more expressive type system than in most common programming languages.

2.1.1 Propositional logic in type theory 

To express logical formulas in type theory, we need some logical connectives such as
conjunction ∧, disjunction ∨, implication → and negation ¬. We can use these connectives
to build new propositions from existing ones. Contrary to classical logic, these connectives
are not defined by a truth table, but instead by the set of their possible proofs.

We will represent a proof as a piece of data (a term) that can be used to reconstruct 
the proof. The form of this data depends on the proposition we wish to prove (here, A
and B are two arbitrary propositions):

A ∧ B  To prove A ∧ B, it is sufficient to prove A and to prove B. Hence we can represent
a proof of A ∧ B as a pair (a, b) where a is a proof of A and b is a proof of B.
Conversely, if we have a proof p of A ∧ B, this gives us proofs of A and B. We
denote these as fst p and snd p respectively.

A ∨ B  To prove A ∨ B, it is sufficient to either prove A or prove B. Hence we can represent
a proof of A ∨ B as a term inl a where a is a proof of A or inr b where b is a proof
of B. Afterwards, we can analyze the structure of a proof of A ∨ B to see whether
it is of the form inl a or inr b. This is a first example of pattern matching, which
will be discussed later.

A → B  To prove A → B, we must be able to transform an arbitrary proof of A into a
proof of B. Hence a proof of A → B is a description of a program that performs
this transformation. If we are given a proof a of A and a proof f of A → B, we can
apply f to a to get a proof f a of B.

⊤  The trivial proposition ⊤ has one proof that we denote by tt.

⊥  The absurd proposition ⊥ has no proof. If we had a proof b of ⊥, then we could
construct a proof absurd_A b of any proposition A.

¬A  We define ¬A to be equal to A → ⊥, i.e. a proof of ¬A is a program that transforms
a proof of A into a proof of ⊥.

We defined a proof of A → B as a program that transforms proofs, which are
represented as data. What do we mean when we say 'a program'? To give a real proof of
A → B, a program must ...

... be defined for each a of type A,

... never get stuck,

... give a result after a finite number of steps,

... give a valid proof of B as a result,

... always give the same result for the same input.

We will represent this kind of proof-transforming program by lambda expressions.
A lambda expression of type A → B is of the form λ(x : A). b where b : B and x is a
variable of type A that can occur freely in b. To apply this expression to a proof a of A, we
must substitute a for x in b. The result of this substitution is written as b[x ↦ a]. Thus
a program in type theory is a function, represented as a term in the form of a lambda
expression.

As an example, we will give a proof of (((A ∨ B) → C) ∧ A) → C. To do this, we must
transform a proof p of ((A ∨ B) → C) ∧ A into a proof of C. A proof p of ((A ∨ B) → C) ∧ A
has two components: the first component is a proof fst p of (A ∨ B) → C and the second
component is a proof snd p of A. We can now construct a proof inl (snd p) of A ∨ B
and hence we can apply the function fst p to this to get a proof of C. The full proof of
(((A ∨ B) → C) ∧ A) → C looks like this:

λ(p : ((A ∨ B) → C) ∧ A). (fst p) (inl (snd p))

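
The connectives and the example proof of this section can be tried out in Agda. The sketch below is one possible encoding and makes a few assumptions of its own: conjunction and disjunction are declared as data types named _∧_ and _∨_, the name example is ours, and implication is simply Agda's built-in function type.

```agda
-- Conjunction: a proof of A ∧ B is a pair of proofs.
data _∧_ (A B : Set) : Set where
  _,_ : A → B → A ∧ B

fst : {A B : Set} → A ∧ B → A
fst (a , b) = a

snd : {A B : Set} → A ∧ B → B
snd (a , b) = b

-- Disjunction: a proof of A ∨ B is a tagged proof of A or of B.
data _∨_ (A B : Set) : Set where
  inl : A → A ∨ B
  inr : B → A ∨ B

-- The example proof of (((A ∨ B) → C) ∧ A) → C from the text.
example : {A B C : Set} → (((A ∨ B) → C) ∧ A) → C
example p = (fst p) (inl (snd p))
```

The type checker plays the role of the proof checker here: the definition of example is accepted only because fst p is a function from A ∨ B to C and inl (snd p) has type A ∨ B.
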
2.1.2 Type theory as a programming language 

If we regard the propositions and proofs we have defined as types and terms instead, we
already have a (primitive) programming language: A → B is a type of functions, A ∧ B
is a type of pairs and A ∨ B is the disjoint union of A and B. In a programming context,
we denote A ∧ B as the product A × B and A ∨ B as the sum A + B of the two types A
and B. This language with →, × and + is called the simply typed lambda calculus.

To write real programs, we need some basic types such as booleans or natural numbers. 

Bool  We define a new type Bool with two canonical terms true : Bool and false : Bool.
For any type A with terms x, y : A and a boolean b : Bool we define the term
if b then x else y, which evaluates to x if b = true or to y if b = false. This
allows us to define functions by case distinction.

Nat  We define a new type Nat with one base term zero : Nat and a successor function
suc : Nat → Nat. For example, the number 2 can be represented as suc (suc zero).
For any type A with a term z : A and a function s : A → A and a natural number
n : Nat we define natrec n z s : A, where natrec n z s equals s (s . . . (s z)) (s
applied n times to z). This allows us to define functions by natural induction.

For example, we can define the function iszero : Nat → Bool as follows:

iszero = λn. natrec n true (λm. false)
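
A possible Agda rendering of natrec and iszero is sketched below. As an assumption of this sketch, natrec is defined by pattern matching rather than taken as a primitive, and the identity type _≡_ and the names test-zero and test-suc are added only to let the type checker confirm two evaluations.

```agda
data Bool : Set where
  true false : Bool

data Nat : Set where
  zero : Nat
  suc  : Nat → Nat

-- natrec n z s applies s to z exactly n times.
natrec : {A : Set} → Nat → A → (A → A) → A
natrec zero    z s = z
natrec (suc n) z s = s (natrec n z s)

iszero : Nat → Bool
iszero = λ n → natrec n true (λ m → false)

infix 4 _≡_

data _≡_ {A : Set} (x : A) : A → Set where
  refl : x ≡ x

-- Both sides evaluate to the same normal form, so refl is accepted.
test-zero : iszero zero ≡ true
test-zero = refl

test-suc : iszero (suc (suc zero)) ≡ false
test-suc = refl
```

The two clauses of natrec are the two natrec evaluation rules of this section, written as a definition by pattern matching.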

To use type theory as a programming language, we need a concept of computation on
terms. This is done by giving evaluation rules, for example fst (a, b) → a. Figure 2.1
gives an overview of the evaluation rules for the terms we have considered so far.

(λ(a : A). b) a' → b[a ↦ a']

fst (a, b) → a        snd (a, b) → b

if true then x else y → x        if false then x else y → y

natrec zero z s → z        natrec (suc n) z s → s (natrec n z s)



Figure 2.1: Evaluation rules for the simply typed lambda-calculus. 



Together with these evaluation rules there are a number of congruence rules that
allow us to apply the evaluation rules to arbitrary subterms. For example, if t₁ → t₁′
and t₂ → t₂′, then we also have t₁ t₂ → t₁′ t₂ and t₁ t₂ → t₁ t₂′. Together these rules
define a binary relation → on the set of all terms. The reflexive and transitive closure
of this relation is written as →*, so we have t →* s if and only if there exist terms
t₀, . . . , tₙ such that t = t₀ → ⋯ → tₙ = s.

A term t is in normal form if there exists no term s such that t → s. If t →* s
with s in normal form, then we say that s is a normal form of t. A normal form of a term
is (in a sense) the simplest form of this term, so it is often seen as giving the meaning of
the term.

A term in which all variables are bound by lambda expressions is called a closed term. 
A very important property of type theory is that every closed term has a unique normal 
form. It follows that we can check the equality of two terms by evaluating them to normal 
form and comparing their normal forms. These and other meta-theoretical properties of 
type theory are discussed in more detail in section 2.5. 



2.1.3 Constructive mathematics in type theory 

Type theory can also be used as an alternative to set theory, where we interpret types 
as sets and terms as elements of that set. So far we have only constructed sets without 
internal structure. It is possible to add extra structure to a type, for example that 
of a group, a topological space, a category, . . . This allows us to formulate and prove 
propositions about those objects, and have these proofs automatically checked by the 
computer. Moreover, the proofs we give have a computational content, in the sense that 
a proof of "there exists an x such that y" contains a program that can construct the x 
mentioned in the theorem. Very recently, a deep theorem from group theory called the 
Odd Order Theorem, which states that every finite group of odd order is solvable, has 
been fully formalized in type theory using the Coq proof assistant [Gon13].

As an example of how this extra structure on a type can be defined, we can take two
points p₁, p₂ : P and define a new type p₁ ⇝ p₂ of paths from p₁ to p₂. These types are
studied in a rich and active area called homotopy type theory. A good introduction to this
fascinating subject can be found in [Rij12].



2.2 What are dependent types? 

From the logic perspective, the types we have introduced so far correspond to propositional 
logic. We will now introduce new types that correspond to predicate logic. For example, 
we want to be able to express and prove the following proposition:

"For each natural number n, suc n is not equal to zero."

For this, we need types that depend on a value. For example, we can define a type 
IsZero n, which depends on the value of the natural number n. The only term of a type 
of the form IsZero n is the term iszero of type IsZero zero. This term iszero is called 
a constructor. So for each natural number n we have a concrete type IsZero n, which is 
non-empty if and only if n is equal to zero. 

Another example of a dependent type is Square n, which expresses the proposition "n
is a perfect square". Its terms are of the form sq n, which has type Square n². Hence
for all m : Nat for which m is not a perfect square, Square m is an empty type, i.e. a
proposition without proof.

In general, a dependent type is a family of types indexed by a base type. Some types
of the family might be empty, others might not be. Hence a dependent type expresses a
certain property that the terms of the base type may or may not have.

Dependent types are not just used for formulating propositions, but also for program- 
ming. For example, we can define a dependent type Fin n that has exactly n elements 
for each n : Nat. This is a dependent type over the base type Nat. This type is useful for 
indexing a list of length n, because it guarantees us that indices which are too big will 
never be used. 
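
The dependent types IsZero and Fin can be declared in Agda as indexed data types. The following is a sketch under our own assumptions: the constructor names fzero and fsuc, the length-indexed list type Vec and the function lookup are not part of the text above and are only there to show how Fin n rules out indices that are too big.

```agda
data Nat : Set where
  zero : Nat
  suc  : Nat → Nat

-- IsZero n is inhabited only when n is zero.
data IsZero : Nat → Set where
  iszero : IsZero zero

-- Fin n has exactly n elements: Fin zero is empty and
-- Fin (suc n) has one element more than Fin n.
data Fin : Nat → Set where
  fzero : {n : Nat} → Fin (suc n)
  fsuc  : {n : Nat} → Fin n → Fin (suc n)

-- Lists whose length is recorded in their type.
data Vec (A : Set) : Nat → Set where
  []  : Vec A zero
  _∷_ : {n : Nat} → A → Vec A n → Vec A (suc n)

-- Indexing cannot go out of bounds: no clause is needed for the
-- empty list, because Fin zero has no elements.
lookup : {A : Set} {n : Nat} → Vec A n → Fin n → A
lookup (x ∷ xs) fzero    = x
lookup (x ∷ xs) (fsuc i) = lookup xs i
```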

2.2.1 The dependent product 

Suppose we have a dependent type D a over a base type A. We introduce a new type
∀(a : A). D a that expresses that D a is inhabited (nonempty) for all a : A. This type is
called a dependent product. A term of this type is a program that gives a term of type
D a on input a : A, i.e. a lambda expression λ(x : A). p where p : D x can contain free
occurrences of the variable x. For example, we can translate the proposition "For each
natural number n, suc n is not equal to zero" to type theory as the type:

∀(n : Nat). ¬(IsZero (suc n))

Remark that ∀(a : A). D a is a generalization of the type A → B where the simple type B
has been replaced by a dependent type D a. For this reason we often write (a : A) → D a
instead of ∀(a : A). D a. The terms of this type are functions of which the type of the
result depends on the value of the concrete argument. In contrast to the regular function
type A → B, this is a feature that does not occur in most typed programming languages.

We can also interpret ∀(a : A). D a as the (possibly infinite) product of all types D a
for a : A. A term t : ∀(a : A). D a can be seen as a tuple and its component at index a
is given by t a. This explains why ∀(a : A). D a is called the dependent product.
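
The proposition "for each natural number n, suc n is not equal to zero" can be stated and proven in Agda as a term of a dependent product type. This is a minimal sketch; the empty type ⊥, the definition of ¬_ and the name suc-not-zero are assumptions of the sketch and are declared locally.

```agda
data Nat : Set where
  zero : Nat
  suc  : Nat → Nat

-- The absurd proposition: a type without constructors.
data ⊥ : Set where

-- Negation is implication of absurdity.
¬_ : Set → Set
¬ A = A → ⊥

data IsZero : Nat → Set where
  iszero : IsZero zero

-- ∀(n : Nat). ¬(IsZero (suc n))
suc-not-zero : (n : Nat) → ¬ (IsZero (suc n))
suc-not-zero n ()
```

The absurd pattern () tells Agda that no constructor of IsZero can produce a term of type IsZero (suc n), so there is no case left to define.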

2.2.2 The dependent sum 

Analogously to the dependent product, we also want a type expressing that D a is
inhabited for at least one a : A. So we define the type ∃(a : A). D a, which is called a dependent
sum. A term of type ∃(a : A). D a is given by a pair (a, p) where a has type A and p
has type D a. So to give a proof of ∃(a : A). D a, we have to construct an actual term a
satisfying D a, and this term can be reconstructed from the proof of ∃(a : A). D a. This
is not possible in classical logic, and that is one of the reasons why type theory is called
a constructive logic.

On one hand, we can see ∃(a : A). D a as a generalization of the regular product A × B 
where the type of the second component depends on the value of the first component. 

On the other hand, we can see ∃(a : A). D a as a disjoint union of the family of all 
types D a. In this case the first component is only a label that assures that the union is 
disjoint. For example, ∃(n : Nat). (Fin n) is the disjoint (infinite) union of all the finite 
types Fin n. 

A third possible interpretation of ∃(a : A). D a is as a subtype of A consisting of all 
a for which D a is inhabited. For example, ∃(n : Nat). (Square n) is the subtype of all 
natural numbers which are a perfect square. 

2.2.3 The identity type 

Another important proposition we want to express in logic is equality between two objects. 
To do this, we introduce for each type A and terms x, y : A an identity type x =_A y, which 
is inhabited if x and y evaluate to the same expression. The only canonical term of this 
type is refl x, which has type x =_A x for any x : A. This is called refl because it 
expresses the reflexivity of the relation =_A. 

We know how to construct a term of type x =_A x, but what can we do with a given 
term of type a =_A b? If we have a term p of type a =_A b, then we know that a and b 
are equal, hence we should be able to substitute b for a in any term. This is essentially 
Leibniz' law that says that equals can be substituted for equals. Let C x be a dependent 
type over the base type A, then we denote this substitution as J b p : C a → C b. If p is 
of the form refl a, then b must be equal to a by definition of refl and hence we have 
the evaluation rule J a (refl a) c ⟶ c. Remark that we don't have J b p c ⟶ c for 
arbitrary p since this would not be type-correct. 

This substitution rule can be generalized so that the type C is not only dependent 
over a : A but also over p : a =_A b. Let A be a type, a : A, and C be a dependent type 
indexed over x : A and p : a =_A x. The full type of J then becomes: 

J : (b : A) → (p : a =_A b) → C a (refl a) → C b p (2.1) 

By using this substitution rule it is possible to give terms 

sym : (x : A) → (y : A) → x =_A y → y =_A x (2.2) 

and 

trans : (x : A) → (y : A) → (z : A) → x =_A y → y =_A z → x =_A z (2.3) 

that prove that the relation =_A is symmetric and transitive (see for example [Rij12] 
p. 15-16). In other words, =_A is an equivalence relation on the terms of type A. 
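
As an illustration, here is one possible way to obtain these two terms from J. This is a 
sketch of our own and not necessarily the construction given in the cited reference. The 
choices of the fixed endpoint a and of the family C are assumptions made for this example; 
both are implicit arguments of J in (2.1). 

```latex
\begin{align*}
&\text{sym: take } a := x \text{ and } C\,b\,p := (b =_A x),
  \text{ so that } C\,x\,(\mathsf{refl}\,x) \text{ is } x =_A x. \\
&\qquad \mathsf{sym}\;x\;y\;p \;:=\; J\;y\;p\;(\mathsf{refl}\,x) \;:\; y =_A x \\[4pt]
&\text{trans: take } a := y \text{ and } C\,b\,q := (x =_A b),
  \text{ so that } C\,y\,(\mathsf{refl}\,y) \text{ is } x =_A y. \\
&\qquad \mathsf{trans}\;x\;y\;z\;p\;q \;:=\; J\;z\;q\;p \;:\; x =_A z
\end{align*}
```

In both cases the last argument of J has exactly the type C a (refl a) required by (2.1), 
and the result type C b p is the equality we wanted to prove. 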

It is surprising that with only this substitution rule it is impossible to prove that 
refl x is the only proof of x =_A x. In other words, it is impossible to give a term of type 
(p : x =_A x) → p =_{x =_A x} (refl x). In homotopy type theory [Rij12], x =_A y is 
interpreted as a type of paths from x to y, where refl x is the trivial path: standing 
still at x. Hence the possibility of other terms of type x =_A y is essential for the 
development of homotopy type theory. 

2.2.4 Universes 

We can regard a dependent type over a base type A as a function from A to types. To 
formalize this intuition, we need to introduce universes. A universe is a type of which the 
terms are types themselves. The archetypical example of a universe is the universe Set, 
which contains all base types we have seen so far: Bool, Nat, ⊤, ⊥, and all types we can 
form by starting from these and using ∧, ∨, →, ∀, ∃, and =. The types in the universe 
Set are called the small types. 

A dependent type over a base type A can be seen as a function of type A → Set; 
for example IsZero is a function of type Nat → Set. The symbols ∀ and ∃ are in fact 
themselves terms of type (A : Set) → (A → Set) → Set, and the identity type has type 
(A : Set) → (a : A) → (b : A) → Set. 

So if Set is itself a type (of types), can we say that Set : Set? It turns out that it 
is not possible to do this in a consistent way, because doing so leads to problems similar 
to Russell's paradox from set theory. We will instead introduce a hierarchy of universes 
Set_0 = Set, Set_1, Set_2, ... where Set_0 : Set_1, Set_1 : Set_2, etc. Each of these 
universes is closed under type constructors such as ∀ and ∃, as in the definition of Set. 

2.3 Formal description 

This section formally describes the core type theory we will use. It is based on UTT 
(Unified Theory of Dependent Types) [Luo94]. This version of type theory only contains 
(dependent) function types and universes. All remaining types can be defined using 
inductive families, as described in the next section. The original description is made in 
a meta-level logical framework. Since we use UTT more as a tool here and not as the 
subject of study, we will not be quite so formal here. UTT also contains a separate 
impredicative universe of propositions, which we also omit because it is not needed for 
pattern matching. 

2.3.1 The building blocks 

Contexts 

When formally describing type theory, we generally want to be able to say that a term t 
has a certain type T under a set of assumptions on the types of the free variables in t. 
We call such a set of assumptions a context. We write a context as a list of variables with 
a type annotation 

(x_1 : X_1)(x_2 : X_2) ... (x_n : X_n) 

where, in general, the variables x_1, ..., x_{i-1} can occur in the type X_i. This allows the 
type of an assumption to depend on a previous assumption, for example in the context 
(n : Nat)(s : Square n). Another example of a context is (A : Set)(x : A)(y : A)(p : x =_A y). 
An arbitrary context is usually denoted by a Greek capital letter Γ, Δ, ... In order to 
avoid inconsistencies, we allow each variable to occur only once in each context. 

Contexts are multifunctional: we can also use them as the type of a list of terms. 
We extend the typing relation x : T to a relation between lists of terms and contexts 
as follows: t_1 t_2 ... t_n : (x_1 : X_1)(x_2 : X_2) ... (x_n : X_n) if and only if t_1 : X_1 and 
t_2 ... t_n : (x_2 : X'_2) ... (x_n : X'_n), where X'_i = X_i[x_1 ↦ t_1]. For example, we have 

9 (sq 3) : (n : Nat)(s : Square n) 

and 

Bool true true (refl true) : (A : Set)(x : A)(y : A)(p : x =_A y) 

var     ::= x | y | z | ... 
term    ::= var                              variable 
          | λ(binding). term                 lambda abstraction 
          | term term                        application 
          | (binding) → term                 dependent product 
          | Set_0 | Set_1 | Set_2 | ...      universes 
binding ::= var : term 
context ::= ε                                empty context 
          | context(binding)                 extended context 

Figure 2.2: Syntax of UTT 

If we have a context Γ = (x_1 : X_1)(x_2 : X_2) ... (x_n : X_n) and B is a type 
with free variables x_1, ..., x_n, then we can form the iterated dependent product 
(x_1 : X_1) → (x_2 : X_2) → ... → (x_n : X_n) → B. Because it quickly becomes tedious 
to write all these arrows, we will write this type as (x_1 : X_1)(x_2 : X_2) ... (x_n : X_n) → B 
or simply Γ → B. This notation for the iterated dependent product is called the telescope 
notation. 

We will also sometimes use a context to stand for the list of variables declared in the 
context. For example, if f : Γ → T then we have Γ ⊢ f Γ : T. 

Syntax 

In figure 2.2, we give the formal syntax of (our version of) UTT. Remark that terms and 
types both belong to the same syntactic class term. This is our way to allow types to be 
terms themselves (i.e. types can be part of a universe). The types are simply those terms 
that have type Set_i for some i. 

Another way to work with universes, which is used in the original description of UTT, 
is to define for each term t in a universe U a type El(t). This has the advantage that 
there is a clear syntactic distinction between terms and types, but we believe our approach 
results in a simpler and more uniform theory. 



Definition 2.1 (Bound occurrence, closed term). All occurrences of a variable 

Definition 2.2 (Free variables). If an occurrence of a variable is not bound, it is 
called free. The set of all variables that occur freely in an expression e is denoted as 
FV(e). 




Strictly speaking, it only makes sense to say that a particular occurrence of a variable 
is free or bound. But if it is clear which occurrence of the variable is meant (e.g. because 
there is only one, or because it doesn't matter which one), then we say that the variable 
itself is free or bound respectively. 

If two terms have the same syntactic structure, then clearly it makes no sense to 
distinguish between them. But if two terms only differ in the names of their bound 
variables - for example λ(x : Set_0). x and λ(y : Set_0). y - then we still regard these as 
having the same meaning. This is expressed in definition 2.4. 

Definition 2.4 (Syntactic equality). If two terms have the same syntax up to 
renaming of bound variables, we call them syntactically equal. 

We will never distinguish between two syntactically equal terms. 
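
To make this notion concrete, the following sketch decides syntactic equality by replacing 
every bound variable name with its distance to the binder before comparing. The tuple 
encoding of terms and the names to_de_bruijn and syntactically_equal are assumptions made 
for this illustration only, and dependent products are left out to keep it short. 

```python
# Terms: ("var", x), ("lam", x, type, body), ("app", f, a), ("set", i).
# Assumption: dependent products are omitted from this small encoding.

def to_de_bruijn(term, bound=()):
    """Replace each bound variable name by its distance to its binder."""
    tag = term[0]
    if tag == "var":
        name = term[1]
        if name in bound:
            return ("bound", bound.index(name))
        return ("free", name)
    if tag == "lam":
        _, x, ty, body = term
        # The innermost binder is put first, so it gets index 0.
        return ("lam", to_de_bruijn(ty, bound), to_de_bruijn(body, (x,) + bound))
    if tag == "app":
        return ("app", to_de_bruijn(term[1], bound), to_de_bruijn(term[2], bound))
    return term  # a universe ("set", i) contains no variables


def syntactically_equal(s, t):
    """Equality of terms up to renaming of bound variables."""
    return to_de_bruijn(s) == to_de_bruijn(t)


id_x = ("lam", "x", ("set", 0), ("var", "x"))     # λ(x : Set_0). x
id_y = ("lam", "y", ("set", 0), ("var", "y"))     # λ(y : Set_0). y
const_z = ("lam", "x", ("set", 0), ("var", "z"))  # λ(x : Set_0). z

print(syntactically_equal(id_x, id_y))
print(syntactically_equal(id_x, const_z))
```

The first comparison succeeds because the two terms differ only in the name of a bound 
variable; the second fails because z is free. 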



Substitutions 

We denote the simultaneous substitution of the terms t_1, ..., t_n for the variables 
x_1, ..., x_n as [x_1 ↦ t_1, ..., x_n ↦ t_n]. We require that t_i ≠ x_i to avoid pathological 
cases. In particular, the identity substitution is denoted as []. The application 
t[x_1 ↦ t_1, ..., x_n ↦ t_n] is defined as the term t where the free occurrences of 
x_1, ..., x_n have been replaced by t_1, ..., t_n respectively. We assume substitutions 
rename bound variables as needed to avoid variable capture. This renaming is always 
possible because we assume an infinite supply of fresh variables and because our 
definition of syntactic equality allows renaming of bound variables. 

The domain of a substitution σ = [x_1 ↦ t_1, ..., x_n ↦ t_n] is the set of variables 
dom(σ) = {x_1, ..., x_n}. If we have two substitutions σ and τ with disjoint domains, then 
we can compose these two substitutions. This is written as σ; τ (first σ, then τ) and is 
done by applying τ to all terms of σ, and then adding the two lists together: 

[x_1 ↦ t_1, ..., x_n ↦ t_n]; [y_1 ↦ s_1, ..., y_m ↦ s_m] 
    = [x_1 ↦ t_1 τ, ..., x_n ↦ t_n τ, y_1 ↦ s_1, ..., y_m ↦ s_m] (2.4) 
(where τ = [y_1 ↦ s_1, ..., y_m ↦ s_m]) 

Remark that t(σ; τ) is syntactically equal to (tσ)τ for all terms t, as would be expected 
from a composition. 
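
The composition (2.4) can be sketched as follows. This is only an illustration under 
simplifying assumptions: terms contain no binders (so no renaming is needed), a variable 
is a string, an application is a tuple of terms, and the names apply_subst and compose are 
hypothetical. 

```python
# A variable is a str; an application is a tuple of terms.
# Assumption: there are no binders, so variable capture cannot occur.

def apply_subst(term, subst):
    """Simultaneously replace the variables of `term` according to `subst`."""
    if isinstance(term, str):
        return subst.get(term, term)
    return tuple(apply_subst(part, subst) for part in term)


def compose(sigma, tau):
    """Build sigma; tau: apply tau to the terms of sigma, then append tau."""
    assert not set(sigma) & set(tau), "the domains must be disjoint"
    composed = {x: apply_subst(t, tau) for x, t in sigma.items()}
    composed.update(tau)
    return composed


sigma = {"x": ("f", "y"), "z": "w"}   # [x ↦ f y, z ↦ w]
tau = {"y": "a", "w": ("g", "b")}     # [y ↦ a, w ↦ g b]
term = ("h", "x", "y", "z")           # h x y z

lhs = apply_subst(term, compose(sigma, tau))      # t(σ; τ)
rhs = apply_subst(apply_subst(term, sigma), tau)  # (tσ)τ
print(lhs == rhs)
```

On this example both sides yield h (f a) a (g b), in agreement with the remark above. 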

If Γ is a context and t̄ is a list of terms such that t̄ : Γ, then we write [Γ ↦ t̄] for the 
substitution that maps the variables in Γ to their corresponding terms in t̄. 



Judgments 

Judgments are the things we can say formally to describe properties of a formal system, 
in this case type theory. In our description of type theory, we will use the following three 
judgments: 

Γ valid means that Γ is a valid context containing a number of hypotheses. 

Γ ⊢ a : A means that we can deduce that a has type A from the hypotheses in the 
context Γ. 

Γ ⊢ a = b : A means that we can deduce that a and b are equal terms of type A from the 
hypotheses in the context Γ. 

Remark that these judgments are the only things we can formally say about our 
system. For example, the would-be judgment ε ⊢ zero = false isn't true or false, it is 
simply meaningless because we can only make judgments about the equality of two terms 
if they have the same type. 

2.3.2 Inference rules 

To define a formal system such as type theory precisely, we use the system of natural 
deduction. In natural deduction, a formal system is described by a set of inference rules. 
An inference rule tells us how we can derive a conclusion from a set of hypotheses. An 
inference rule is written as 

$$\frac{H_1 \quad \cdots \quad H_n}{J}$$

where H_1, ..., H_n are the hypotheses and J is the conclusion. We can compose inference 
rules by replacing a hypothesis by another inference rule that has that hypothesis as a 
conclusion. By combining inference rules like this, we can build a proof tree. If there is 
a proof tree with the judgments H_1, ..., H_n in its leaves and the judgment J at its root, 
then we say that J is derivable from H_1, ..., H_n. 

Valid contexts 

$$\frac{}{\varepsilon\ \text{valid}} \qquad \frac{\Gamma \vdash A : \mathsf{Set}_i \quad x \notin FV(\Gamma)}{\Gamma(x : A)\ \text{valid}}$$



Assumption rule 

$$\frac{\Gamma\ \text{valid} \quad x : A \in \Gamma}{\Gamma \vdash x : A}$$



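General equality rules 

$$\frac{\Gamma \vdash t : A}{\Gamma \vdash t = t : A} \qquad \frac{\Gamma \vdash t_1 = t_2 : A}{\Gamma \vdash t_2 = t_1 : A} \qquad \frac{\Gamma \vdash t_1 = t_2 : A \quad \Gamma \vdash t_2 = t_3 : A}{\Gamma \vdash t_1 = t_3 : A}$$

Subsumption rule 

$$\frac{\Gamma \vdash t : A_1 \quad \Gamma \vdash A_1 = A_2 : \mathsf{Set}_i}{\Gamma \vdash t : A_2}$$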

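List of terms in a context 

$$\frac{\Gamma\ \text{valid}}{\Gamma \vdash \varepsilon : \varepsilon} \qquad \frac{\Gamma \vdash t : A \quad \Gamma \vdash \bar{t} : \Delta[x \mapsto t]}{\Gamma \vdash t\,\bar{t} : (x : A)\Delta}$$

$$\frac{\Gamma \vdash t = s : A \quad \Gamma \vdash \bar{t} = \bar{s} : \Delta[x \mapsto t]}{\Gamma \vdash t\,\bar{t} = s\,\bar{s} : (x : A)\Delta}$$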

Formation rule for Set_i 

$$\frac{\Gamma\ \text{valid}}{\Gamma \vdash \mathsf{Set}_i : \mathsf{Set}_{i+1}}$$



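Formation rule for the dependent product 

$$\frac{\Gamma \vdash A : \mathsf{Set}_i \quad \Gamma(x : A) \vdash B : \mathsf{Set}_j}{\Gamma \vdash (x : A) \to B : \mathsf{Set}_{\max(i,j)}}$$

$$\frac{\Gamma \vdash A_1 = A_2 : \mathsf{Set}_i \quad \Gamma(x : A_1) \vdash B_1 = B_2 : \mathsf{Set}_j}{\Gamma \vdash (x : A_1) \to B_1 = (x : A_2) \to B_2 : \mathsf{Set}_{\max(i,j)}}$$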



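Introduction rule for λ-abstractions 

$$\frac{\Gamma(x : A) \vdash t : B}{\Gamma \vdash \lambda(x : A).\, t : (x : A) \to B}$$

$$\frac{\Gamma \vdash A_1 = A_2 : \mathsf{Set}_i \quad \Gamma(x : A_1) \vdash t_1 = t_2 : B}{\Gamma \vdash \lambda(x : A_1).\, t_1 = \lambda(x : A_2).\, t_2 : (x : A_1) \to B}$$

Application 

$$\frac{\Gamma \vdash f : (x : A) \to B \quad \Gamma \vdash t : A}{\Gamma \vdash f\,t : B[x \mapsto t]}$$

$$\frac{\Gamma \vdash f_1 = f_2 : (x : A) \to B \quad \Gamma \vdash t_1 = t_2 : A}{\Gamma \vdash f_1\,t_1 = f_2\,t_2 : B[x \mapsto t_1]}$$

β-equivalence 

$$\frac{\Gamma(x : A) \vdash t : B \quad \Gamma \vdash s : A}{\Gamma \vdash (\lambda(x : A).\, t)\, s = t[x \mapsto s] : B[x \mapsto s]}$$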

η-equivalence 

$$\frac{\Gamma \vdash f : (x : A) \to B}{\Gamma \vdash \lambda(x : A).\, f\,x = f : (x : A) \to B}$$

2.3.3 Evaluation 

The rules given in the previous section tell us which terms have which types and when 
two terms are equal, but they don't tell us anything about the computational meaning of 
a term. This computational meaning is given by an evaluation rule. 

Definition 2.5 (Evaluation rule). An evaluation rule is a binary relation ⟶ on 
terms satisfying the following properties: 

• If s ⟶ t, then s u ⟶ t u for all u. 

• If s ⟶ t, then f s ⟶ f t for all f. 

• If s ⟶ t, then λ(x : A). s ⟶ λ(x : A). t for all A. 

• If A ⟶ B, then λ(x : A). t ⟶ λ(x : B). t for all t. 

• If A ⟶ B, then (x : A) → C ⟶ (x : B) → C for all C. 

Definition 2.6 (β-evaluation). The β-evaluation ⟶_β is the smallest evaluation 
rule that satisfies (λ(x : A). t) s ⟶_β t[x ↦ s] for all s and t. 

Definition 2.7 (η-evaluation). The η-evaluation ⟶_η is the smallest evaluation 
rule that satisfies λ(x : A). f x ⟶_η f whenever x is not free in f. 

For any evaluation rule ⟶, we define ⟶* as the reflexive and transitive closure of ⟶. 

Definition 2.8 (Normal form). A term t is a normal form with respect to the 
evaluation rule ⟶ if there is no term s such that t ⟶ s. A term t' has normal form t 
with respect to ⟶ if t' ⟶* t with t a normal form. 
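
The definitions above can be illustrated by a small evaluator that computes β-normal 
forms. This is a sketch under several assumptions of our own: terms are untyped (type 
annotations on binders are dropped), η-evaluation is left out, redexes are contracted in 
leftmost-outermost order, and the names free_vars, subst, step and normalize are 
hypothetical. 

```python
import itertools

# Terms: ("var", x), ("lam", x, body), ("app", f, a).
# Assumption: type annotations and η-evaluation are omitted.

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])


def fresh(avoid):
    """Return a variable name v0, v1, ... that is not in `avoid`."""
    return next(f"v{i}" for i in itertools.count() if f"v{i}" not in avoid)


def subst(t, x, s):
    """Compute t[x ↦ s], renaming bound variables to avoid capture."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    y, body = t[1], t[2]
    if y == x:                      # x is shadowed: nothing to replace
        return t
    if y in free_vars(s):           # rename the binder before substituting
        z = fresh(free_vars(s) | free_vars(body) | {x})
        body, y = subst(body, y, ("var", z)), z
    return ("lam", y, subst(body, x, s))


def step(t):
    """Perform one β-step, or return None if t is a β-normal form."""
    if t[0] == "app":
        f, a = t[1], t[2]
        if f[0] == "lam":           # (λx. u) a ⟶ u[x ↦ a]
            return subst(f[2], f[1], a)
        r = step(f)
        if r is not None:
            return ("app", r, a)
        r = step(a)
        return None if r is None else ("app", f, r)
    if t[0] == "lam":
        r = step(t[2])
        return None if r is None else ("lam", t[1], r)
    return None


def normalize(t, fuel=1000):
    """Iterate `step` until a normal form is reached (bounded by `fuel`)."""
    for _ in range(fuel):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    raise RuntimeError("no normal form found within the step budget")


identity = ("lam", "x", ("var", "x"))
const = ("lam", "x", ("lam", "y", ("var", "x")))
print(normalize(("app", identity, ("var", "a"))))
print(normalize(("app", const, ("var", "y"))))
```

The second example shows why renaming is needed: applying λx. λy. x to the free variable y 
must not capture y, so the inner binder is renamed first. The step budget is only there 
because untyped terms need not have a normal form. 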

We will usually drop the subscript in ⟶_β and ⟶_η and work with an evaluation rule 
⟶ that includes both β- and η-evaluation. Later on we will expand this evaluation rule 
for functions defined by pattern matching. 

2.3.4 Equality in type theory 

We have seen three different notions of equality between terms of type theory: syntactic 
equality, definitional equality, and propositional equality: 

• Two terms s and t are syntactically equal if they look the same, i.e. if they have 
the same syntactical structure up to renaming of bound variables. 

• Two terms s and t are definitionally equal (or convertible) if they are equal according 
to the evaluation rules, i.e. Γ ⊢ s = t : T where Γ is the context of free variables of 
s and t. 

• Two terms s and t are propositionally equal if we can prove their equality in type 
theory, i.e. if there is a term p such that Γ ⊢ p : s =_T t where Γ is the context of 
free variables of s and t. 

When we write s = t in an argument, we mean that s and t are definitionally equal (unless 
noted otherwise). For example, we can say that (λ(x : A). x) a = a. 

When working with type theory in dependently typed languages such as Agda or Coq, 
it is more convenient to have definitional equalities rather than propositional equalities. 
This is because definitional equality can be checked automatically, while propositional 
equality has to be proven manually. This is especially true when reasoning about open 
terms. 

The equalities are ordered from finest to coarsest: syntactic equality implies 
definitional equality implies propositional equality. It is clear that syntactic equality is 
strictly stronger than definitional equality: any two terms of the form (λ(x : A). t) s and 
t[x ↦ s] are definitionally equal but not syntactically equal. Definitional equality is also 
strictly stronger than propositional equality: the terms if x then true else true and true 
are propositionally equal but not definitionally equal, since if x then true else true 
doesn't evaluate to true. 

It can be startling that equality in type theory does not correspond to the intuitive 
notion of equality from set theory. This is because equality in set theory is extensional: 
two functions are equal if they give equal values for any closed argument. In contrast, 
equality in type theory is intensional: two functions are equal if they have the same 
definition. 

The definitional equality is intensional. Indeed, there is no general way to derive 
Γ ⊢ f = g : A → T from the fact that Γ ⊢ f a = g a : T for all closed terms a. 

The propositional equality is also intensional because it is not always possible to 
construct a term of type f =_{A → T} g from a collection of terms p_a : f a =_T g a for 
each closed a : A. 

Of course, the syntactic equality is extensional: if f a and g a are syntactically 
equal for all closed terms a, then f and g are syntactically equal. But the syntactic 
equality is too restrictive for most situations: it does not even satisfy simple equalities 
such as (λ(x : A). x) a = a. 

2.4 Inductive families 

An important aspect of type theory is the fact that it is an open system: it is possible 
to add new types and functions to the theory, without invalidating older results. This 
allows type theory to be used in many different ways, while staying small and elegant 
enough for theoretical study. 

So far, we defined a number of types by hand. Each time we add a new type like this, 
it is possible it will make the whole system inconsistent as a logic. To avoid this, we have 
to check that the new type cannot give us more than we put into it. For example, for the 
type A ∧ B we have a function fst : A ∧ B → A. This is allowed because the only way to 
construct a term of type A ∧ B involves giving an a : A. For the function fst, this was 
still quite easy to check, but in general this becomes harder. 

To avoid inconsistencies, we will equip type theory with a general method to define 
new types. This method will always result in good types that do not break consistency. 
It is also general enough to define all types we have seen so far, with the exception of 
the function types (a : A) → B and the universes Set_i. We call the types defined in this 
way inductively defined families of data types or simply inductive families. Inductive 
families are discussed in detail in [Dyb94]. 

2.4.1 Examples 

An inductive family is a dependent type defined by a set of constructors. Intuitively, 
constructors tell us how to build new canonical terms of the inductive family. The word 
'inductive' reflects the fact that the constructors are essentially the only way to construct 
such terms. This fact will be exploited when we describe pattern matching in the next 
chapter. The archetypical example of an inductive family is the dependent type Vec A n 
of vectors of A's of length n. There are two constructors for this family: 

ε : Vec A zero 

is the empty vector of length 0 and 

cons : (n : Nat)(a : A)(v : Vec A n) → Vec A (suc n) 

allows us to add one element a in front of a vector v of length n in order to construct a 
vector of length n + 1. 

Suppose that we want to define an inductive family D such that Γ ⊢ D : Δ → Set_i. 
We call the variables in Γ the parameters of the family and the variables in Δ the indices. 
The difference between them is that the parameters are constant throughout the family, 
while the indices can vary from constructor to constructor. In the example of Vec A n, 
there is one parameter A : Set and one index n : Nat. Each constructor c of D must have 
a type of the form Φ → D ī where ΓΦ ⊢ ī : Δ. For example, for the constructor ε we 
have Φ = ε and ī = zero, while for cons we have Φ = (n : Nat)(a : A)(v : Vec A n) and 
ī = suc n. Remark that the type D may occur in the context Φ. There is a restriction 
on where this may happen; this will be discussed in the next section. 

We can define many of the types we saw before as inductive types: 

Nat is an inductive family with no parameters, no indices, and two constructors zero : Nat 
and suc : Nat → Nat. 

A ∧ B is an inductive family with two parameters A and B, no indices, and one 
constructor (·,·) : A → B → A ∧ B. 

∃(a : A). D a is an inductive family with two parameters A and D, no indices, and one 
constructor (·,·) : (a : A) → D a → ∃(a : A). D a. Remark that it is no problem 
for two different inductive families to share the same constructor name. 

Square n is an inductive family without parameters, with one index n : Nat, and one 
constructor sq : (n : Nat) → Square (n²). 

x =_A y is an inductive family with one parameter A, two indices x, y : A, and one 
constructor refl : (x : A) → x =_A x. It is also possible to see the first index x as a 
parameter instead; this doesn't change much in the meaning of the definition. 

We can also use inductive families to define new properties and relations on existing 
types. For example, we can define an inductive family n ≤ m with indices n, m : Nat and 
two constructors 

equal : (n : Nat) → n ≤ n 
smaller : (n : Nat)(m : Nat)(p : n ≤ m) → n ≤ (suc m) 

2.4.2 Strict positiveness 

In the definition of an inductive family D, it is allowed that D occurs in the type of 
the arguments of its constructors. For example, Nat is the type of the argument of the 
constructor suc : Nat → Nat. But allowing this in general would cause severe problems. 
To see where the problem lies, consider the type Bad with just one constructor 

bad : (Bad → ⊥) → Bad 

We claim that this definition leads to a contradiction. To show this, we will define a 
function f : Bad → ⊥. If we can do this, then we have bad f : Bad and hence f (bad f) : ⊥. 
So just by defining the type Bad, we have introduced a closed term of type ⊥, which 
is supposed to be empty. The logical interpretation of the type Bad makes the problem 
especially clear: it says "The proposition Bad is true if and only if Bad is false." The 
'if' part holds because of the type of the constructor bad, while the 'only if' part holds 
because Bad is the smallest type that contains all terms bad x. It is no wonder that 
definitions like this lead to inconsistencies. 

Here is the definition of the function f: 

f (bad x) = x (bad x) 

This definition uses pattern matching, which we haven't formally defined yet, but we hope 
the idea is clear. Remark that 

f (bad f) ⟶ f (bad f) 

so evaluation of this term will never terminate, i.e. f (bad f) does not have a normal 
form. 
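
The looping behaviour can be observed in an untyped setting. The following sketch is only 
an analogue of the situation above: Python has no type ⊥, the class Bad merely stands in 
for the constructor bad, and the infinite evaluation shows up as a RecursionError. 

```python
class Bad:
    """Untyped stand-in for the constructor bad : (Bad → ⊥) → Bad."""

    def __init__(self, fn):
        self.fn = fn


def f(b):
    # Mirrors the clause f (bad x) = x (bad x).
    return b.fn(b)


try:
    f(Bad(f))  # f (bad f) ⟶ f (bad f) ⟶ ...
    looped = False
except RecursionError:
    looped = True

print(looped)
```

Each call to f immediately calls f again on the same argument, so the interpreter gives up 
once its recursion limit is reached. 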

To avoid the definition of problematic types like Bad, we will put a condition on the 
types of the arguments of constructors. If c : Φ → D ī is a constructor of D, then D may 
only occur strictly positively in each type of Φ. We say that D is strictly positive in a 
type T if it only occurs to the right of every arrow → in T. This rules out constructors 
like bad : (Bad → ⊥) → Bad because Bad occurs to the left of an arrow in the argument 
type (Bad → ⊥). 

Definition 2.9 (Strictly positive occurrence). A (type) variable X is strictly 
positive in the type T if X does not occur in T or if T is of the form Ψ → X ū 
where X does not occur in Ψ. 
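
Definition 2.9 can be turned into a simple check. The sketch below works under 
assumptions of our own: a type is either a name or a non-dependent arrow, indices are 
ignored, and the names occurs, strictly_positive and constructor_ok are hypothetical. 

```python
# A type is a name (str) or an arrow ("->", domain, codomain).
# Assumption: indices and type dependencies are ignored in this encoding.

def occurs(x, ty):
    """Does the name x occur anywhere in the type ty?"""
    if isinstance(ty, str):
        return ty == x
    _, dom, cod = ty
    return occurs(x, dom) or occurs(x, cod)


def strictly_positive(x, ty):
    """x does not occur in ty, or ty = Ψ → x where x does not occur in Ψ."""
    if not occurs(x, ty):
        return True
    while not isinstance(ty, str):
        _, dom, cod = ty
        if occurs(x, dom):   # x appears to the left of an arrow
            return False
        ty = cod
    return ty == x


def constructor_ok(family, argument_types):
    """Check the condition for every argument type of a constructor."""
    return all(strictly_positive(family, ty) for ty in argument_types)


print(constructor_ok("Nat", ["Nat"]))                  # suc : Nat → Nat
print(constructor_ok("Bad", [("->", "Bad", "Bot")]))   # bad : (Bad → ⊥) → Bad
```

The constructor suc passes the check, while bad is rejected because Bad occurs to the left 
of the arrow in its argument type. 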

2.4.3 Definition 

Now we can give the formal definition of inductive families. 

Definition 2.10 (Inductive family). Let Γ, Δ, Φ_1, ..., Φ_n be contexts and 
ī_1, ..., ī_n be lists of terms such that ΓΔ valid and Γ(D : Δ → Set_i)Φ_k ⊢ ī_k : Δ for 
k = 1, ..., n. Suppose D only occurs strictly positively in Φ_1, ..., Φ_n. Then we can 
define an inductive family with constructors c_1, ..., c_n by the following rules: 

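$$\frac{\Gamma'\ \text{valid}}{\Gamma'\Gamma \vdash D : \Delta \to \mathsf{Set}_i} \qquad \frac{\Gamma'\ \text{valid}}{\Gamma'\Gamma \vdash c_k : \Phi_k \to D\ \bar{\imath}_k}$$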

Constructors tell us how to construct terms of an inductive family, but not how to use 
them. Classically, the definition of each inductive family is accompanied by an induction 
principle that describes how terms of that type can be used. While it is possible to write 
programs and proofs using these induction principles, it is very tedious to do so in real 
applications. The alternative is to extend type theory with pattern matching, which is 
what we will do in the next chapter. 

2.5 Fundamental theorems 

In this section, we will describe some of the most important theorems of type theory. 
These theorems are important for several reasons. One reason is that type theory can 
also be used as a logic, so we cannot allow e.g. infinitely long proofs. Another reason is 
that these results are needed to keep type checking decidable in the presence of dependent 
types. Most of the theorems given here are proven in [Gog94], which gives a thorough 
description of the metatheory of UTT. 

The first theorem tells us that each term can have at most one type. 

Theorem 2.11 (Unique types). If Γ ⊢ t : T_1 and Γ ⊢ t : T_2 then 
Γ ⊢ T_1 = T_2 : Set_i. 

Proof. Corollary 6.8.1 of [Gog94]. □ 

Using the fact that each term has a unique type and the typing rules given earlier in 
this chapter, we can use the form of the term to learn something about the form of its type: 

• If a term of the form λ(x : A). t has any type, then it is of the form (x : A) → B. 

• If a term of the form (x : A) → B has any type, then it is Set_i for some i. 

• Set_i always has type Set_{i+1}. 



The second theorem tells us that the type of a term is preserved under evaluation. 
This is essential because t ⟶ t' implies that t and t' represent the same object, and that 
t' is just a 'simpler' representation of this object than t. 

Theorem 2.12. If Γ ⊢ t : T and t ⟶ t', then Γ ⊢ t' : T. 

Proof. Corollary 6.8.5 of [Gog94]. □ 

The next theorem tells us that there can be no infinite evaluation sequences. For type 
theory as a programming language, this means that well-typed programs will always halt. 
For type theory as a logic, this means that there can be no infinitely long proofs. 

Definition 2.13 (Strong normalization). An evaluation rule ⟶ is strongly 
normalizing if there is no infinite sequence of terms such that t_0 ⟶ t_1 ⟶ t_2 ⟶ ... 

Theorem 2.14. The βη-evaluation rule ⟶ is strongly normalizing. 

Proof. Corollary 6.8.4 of [Gog94]. □ 

Theorem 2.16 assures us that no matter which evaluation rules we apply first, we will 
always get the same result. This allows us to speak about the normal form of a term. 

Definition 2.15 (Church-Rosser property). An evaluation rule ⟶ has the 
Church-Rosser property if the following holds: if for any term t we have t ⟶* t_1 and 
t ⟶* t_2, then there is a term t' such that t_1 ⟶* t' and t_2 ⟶* t'. 

Theorem 2.16 (Church-Rosser property). The βη-reduction ⟶ has the 
Church-Rosser property. 

Proof. Corollary 6.8.6 of [Gog94]. □ 

Corollary 2.17 (Unicity of normal forms). Two terms are definitionally equal if 
and only if they have the same normal form. 



This corollary implies that definitional equality of two terms of a given type is decidable. 

As mentioned at the start of section 2.4, terms of an inductive family can be 
constructed by using its constructors, and only by using its constructors. We will now turn 
this into a precise statement. 

Definition 2.18 (Constructor form). A term t is in constructor form if it is of 
the form c t_1 ... t_m where c is a constructor. 

Definition 2.19 (Completeness). An evaluation rule ⟶ is complete if the 
following holds: for any inductive family D and any closed term t : D ī in normal form 
we have that t is in constructor form. 

Lemma 2.20. If t is a closed normal form of type (x_1 : A_1) ... (x_n : A_n) → 
D i_1 ... i_n then t is either of the form λ(x : A). u or c t_1 ... t_m where c is a 
constructor of the family D. 

Proof. By theorem 2.11 (unicity of types), t is either a variable, a lambda abstraction 
λ(x : A). u, an application u v, or a constructor c. 

• Because t is a closed term, it cannot be a variable. 

• If t is a lambda abstraction then we have proven the statement. 

• If t is an application u v, then u must also be a normal form. By induction on the 
size of t, we have that u is either of the form λ(x : A). s or c u_1 ... u_m. But in the 
first case we have that t = (λ(x : A). s) v is not a normal form, a contradiction. So 
we must have t = c u_1 ... u_m v, which is of the correct form. 

• If t is a constructor c, then it is also of the correct form. □ 

Theorem 2.21. The βη-evaluation rule ⟶ is complete. 

Proof. Let t : D ī be a closed term in normal form. Lemma 2.20 gives us that t is of the 
form λ(x : A). u or c t_1 ... t_m. But if t is a lambda abstraction, then it would have a type 
of the form (x : A) → B, a contradiction. So t is of the form c t_1 ... t_m for a constructor 
c of the family D, as we wanted to prove. □ 

Combining the above theorem with theorem 2.14, we get that any closed term t in an 
inductive family evaluates to a constructor form. So constructor forms are the 'canonical 
forms' of an inductive family. We can interpret this theorem as a progress theorem for 
type theory: the evaluation of a well-typed term never gets stuck. 



Chapter 3 
Pattern Matching 



Terms in constructor form play a special role in inductive families. Indeed, theorem 2.21 
tells us that all closed normal forms of an inductive family are in constructor form. So in 
order to define a function taking inputs from an inductive family, it is sufficient to define 
it for all inputs in constructor form. For example, to define a function f : Nat → T, it is 
sufficient to define f zero and f (suc n) for an arbitrary n : Nat. This is the principle on 
which pattern matching is based. 

The kind of pattern matching with dependent types described in this chapter was 
first described by T. Coquand in [Coq92]. A useful extension of pattern matching (which 
is not described in this thesis) is the 'with'-clause as described in [MM04]. Dependent 
pattern matching has also been implemented in multiple dependently typed languages, 
such as DML¹ [Xi03], Agda [Nor07], and Coq [Soz10]. 

3.1 Simple pattern matching 

We will start by describing pattern matching where all argument types are simple (i.e. 
non-dependent) types. The result type can be a dependent type, however. This will allow 
us to discuss important facets of pattern matching before introducing complications like 
inaccessible patterns and absurd patterns, which are needed to deal with dependent types. 

3.1.1 Patterns and clauses 

A pattern is a special kind of term that consists of only constructors and free variables. 
These free variables in a pattern are called the pattern variables. The pattern variables in 
a pattern are usually required to be distinct; if this condition is satisfied then the pattern 
is called linear. Suppose we have a pattern p and an arbitrary term t, then we say that t 
matches p if there is a substitution σ with as its domain the pattern variables of p such 
that t = pσ. This is written as MATCH(p, t) ⇒ σ. If there is no such substitution, we 
write MATCH(p, t) ⇑. Figure 3.1 describes an algorithm that determines whether a term t 
matches a pattern p. This is not hard because patterns have a simple tree-like structure. 
The algorithm described by these rules is called pattern matching. Remark that it is 
essential that the patterns are linear for this algorithm to work, otherwise the variables 
in the domain of the substitution would not be disjoint. 

¹ DML is no longer under development, but the same kind of pattern matching is used in 
its successor ATS. 

$$\frac{}{\mathrm{MATCH}(x, t) \Rightarrow [x \mapsto t]}$$

$$\frac{\mathrm{MATCH}(\bar{p}, \bar{t}) \Rightarrow \sigma}{\mathrm{MATCH}(c\,\bar{p}, c\,\bar{t}) \Rightarrow \sigma} \qquad \frac{\mathrm{MATCH}(\bar{p}, \bar{t}) \Uparrow}{\mathrm{MATCH}(c\,\bar{p}, c\,\bar{t}) \Uparrow} \qquad \frac{c_1 \neq c_2}{\mathrm{MATCH}(c_1\,\bar{p}, c_2\,\bar{t}) \Uparrow}$$

$$\frac{}{\mathrm{MATCH}(\varepsilon, \varepsilon) \Rightarrow []} \qquad \frac{\mathrm{MATCH}(p, t) \Rightarrow \sigma \quad \mathrm{MATCH}(\bar{p}, \bar{t}) \Rightarrow \sigma'}{\mathrm{MATCH}(p\,\bar{p}, t\,\bar{t}) \Rightarrow \sigma; \sigma'}$$

$$\frac{\mathrm{MATCH}(p, t) \Uparrow}{\mathrm{MATCH}(p\,\bar{p}, t\,\bar{t}) \Uparrow} \qquad \frac{\mathrm{MATCH}(\bar{p}, \bar{t}) \Uparrow}{\mathrm{MATCH}(p\,\bar{p}, t\,\bar{t}) \Uparrow}$$

Figure 3.1: Rules describing the pattern matching algorithm. 

To define a new function $f : (a : A) \to B\ a$, we give a set of equations of the form
f p = t where p is a pattern of type A and t is a term of type B p that can contain the
pattern variables of p. Such an equation is called a clause of the definition, and the term
t is called the right-hand side of the clause. Here is an example of a definition by pattern
matching of the function $\mathrm{pred} : \mathrm{Nat} \to \mathrm{Nat}$ that calculates the predecessor:

pred zero = zero
pred (suc n) = n                                        (3.1)

A definition by pattern matching is evaluated as follows: to evaluate f s, we search for
a clause f p = t such that $\mathrm{MATCH}(p, s) \Rightarrow \sigma$. If we find such a clause, then we have
$f\ s = f\ (p\sigma) \to t\sigma$.

The right-hand side of a clause is allowed to contain recursive calls to the function
we are defining. For example,

double zero = zero
double (suc n) = suc (suc (double n))                   (3.2)

We can also define functions $f : (a_1 : A_1) \ldots (a_n : A_n) \to B\ a_1 \ldots a_n$ with multiple
arguments by working with lists of patterns. Each clause of f is of the form
$f\ p_1 \ldots p_n = t$. The linearity condition now implies that the pattern variables of all the
patterns $p_1, \ldots, p_n$ should be different. For example,

plus zero n = n
plus (suc m) n = suc (plus m n)                         (3.3)
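
The evaluation of such definitions can be sketched in a few lines of Python. Everything in this sketch is an illustrative assumption rather than the thesis's implementation: terms are tuples headed by a constructor or function name, the table CLAUSES holds the three example definitions, and arguments are evaluated before a clause is selected.

```python
ZERO = ("zero",)


def suc(t):
    return ("suc", t)


def num(k):
    """The numeral suc (... (suc zero)) with k occurrences of suc."""
    return ZERO if k == 0 else suc(num(k - 1))


def match(p, t):
    """Substitution sigma with p*sigma == t for a closed term t, or None."""
    if isinstance(p, str):
        return {p: t}
    if p[0] != t[0]:
        return None
    sigma = {}
    for q, u in zip(p[1:], t[1:]):
        s = match(q, u)
        if s is None:
            return None
        sigma.update(s)
    return sigma


def subst(t, sigma):
    """Apply the substitution sigma to the term t."""
    if isinstance(t, str):
        return sigma[t]
    return (t[0],) + tuple(subst(a, sigma) for a in t[1:])


# Each function maps to its clauses: (list of patterns, right-hand side).
CLAUSES = {
    "pred": [([ZERO], ZERO),
             ([suc("n")], "n")],
    "double": [([ZERO], ZERO),
               ([suc("n")], suc(suc(("double", "n"))))],
    "plus": [([ZERO, "n"], "n"),
             ([suc("m"), "n"], suc(("plus", "m", "n")))],
}


def evaluate(t):
    """Evaluate a closed term: arguments first, then a matching clause."""
    head, args = t[0], tuple(evaluate(a) for a in t[1:])
    if head not in CLAUSES:
        return (head,) + args          # constructor: nothing to unfold
    for patterns, rhs in CLAUSES[head]:
        # Wrap both lists in a dummy head so that match handles them.
        sigma = match(("",) + tuple(patterns), ("",) + args)
        if sigma is not None:
            return evaluate(subst(rhs, sigma))
    raise ValueError("no clause matches: the definition is incomplete")
```

With these clauses, evaluating plus applied to the numerals 2 and 3 unfolds the second clause twice and the first clause once. Because the patterns of each example are disjoint, the order in which the clauses are tried does not matter here.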

3.1.2 Required properties

If we allow any definition by pattern matching, then we encounter a number of problems: 

• It might be that there is a closed normal form s that matches none of the patterns
in the definition of f. Then f s doesn't evaluate any further, hence it is a normal
form that is not in constructor form. Hence it is possible to break the completeness
theorem 2.21.

• Because the right-hand side of a clause can contain recursive calls to the function f,
it is possible that the evaluation of the function f never stops. For example, there
can be a clause of the form f x = f (suc x). Hence it is possible to break the strong
normalization theorem 2.14.

• It is also possible that a term matches multiple patterns with unequal right-hand
sides. Hence it is possible to break the Church-Rosser property 2.16.

To avoid these problems, we will give a number of conditions that definitions by pattern 
matching must satisfy. Specifically, we require the following three conditions, correspond- 
ing to the three above problems: 

Completeness For each closed term s, there must be a pattern p in the definition such
that s matches p.

Termination There can be no infinite sequence of evaluation steps
$f\ a \to t_1 \to t_2 \to \ldots$ where f occurs in each of the $t_i$.

Confluence If there are two clauses $f\ p_1 = t_1$ and $f\ p_2 = t_2$ and substitutions
$\sigma_1, \sigma_2$ such that $p_1\sigma_1 = p_2\sigma_2$, then $t_1\sigma_1$ and $t_2\sigma_2$ must be equal.

When these three requirements are satisfied, we can safely add the new constant f
to the theory, together with the evaluation rules given by the clauses of f.

3.1.3 First-match semantics 

While the requirements for completeness and termination are conventional, our require- 
ment that definitions must be confluent is not. Instead, overlapping patterns are usually 
interpreted on a first-match basis: only the first matching clause is considered. This 
allows us, for example, to give a definition of $\mathrm{equal} : \mathrm{Nat} \to \mathrm{Nat} \to \mathrm{Bool}$ as follows:

equal zero zero = true
equal (suc m) (suc n) = equal m n                       (3.4)
equal m n = false

Clearly, the last clause does not hold as a definitional equality but rather only holds when 
the first two clauses don't match. 

If we use the first-match semantics for overlapping patterns, it is always possible to 
give an equivalent definition without overlapping patterns. In our example, this can be 
done by replacing the last clause by these two clauses: 

equal zero (suc n) = false
equal (suc m) zero = false                              (3.5)
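
The difference between the two definitions can be checked with a small Python sketch. It assumes the same illustrative encoding of terms as tuples, and a hypothetical helper first_match that tries the clauses in the given order. Both clause sets compute the same function, but only the expanded one gives the same answers when its clauses are reordered.

```python
ZERO = ("zero",)


def suc(t):
    return ("suc", t)


def num(k):
    return ZERO if k == 0 else suc(num(k - 1))


def match(p, t):
    """Substitution sigma with p*sigma == t for a closed term t, or None."""
    if isinstance(p, str):
        return {p: t}
    if p[0] != t[0]:
        return None
    sigma = {}
    for q, u in zip(p[1:], t[1:]):
        s = match(q, u)
        if s is None:
            return None
        sigma.update(s)
    return sigma


def first_match(clauses, args):
    """Return the value of the first clause whose patterns match args."""
    for patterns, rhs in clauses:
        sigma = match(("",) + tuple(patterns), ("",) + tuple(args))
        if sigma is not None:
            return rhs(sigma)
    raise ValueError("no clause matches")


# The definition with a catch-all last clause.
EQUAL = [
    ([ZERO, ZERO], lambda s: True),
    ([suc("m"), suc("n")], lambda s: equal(s["m"], s["n"])),
    (["m", "n"], lambda s: False),
]

# The equivalent definition without overlapping patterns.
EQUAL_EXPANDED = [
    ([ZERO, ZERO], lambda s: True),
    ([suc("m"), suc("n")], lambda s: equal_expanded(s["m"], s["n"])),
    ([ZERO, suc("n")], lambda s: False),
    ([suc("m"), ZERO], lambda s: False),
]


def equal(m, n):
    return first_match(EQUAL, [m, n])


def equal_expanded(m, n):
    return first_match(EQUAL_EXPANDED, [m, n])
```

Trying the clauses of EQUAL in reverse order makes the catch-all clause fire on the arguments zero and zero, which illustrates why that clause cannot be read as a definitional equality on its own.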

In this case there is only one extra clause in the expansion, but this number can increase
sharply with the number of constructors. So the main advantage of the first-match
semantics is that it allows us to give shorter definitions. 

In section 4.4, we will see how the first-match semantics can be used in the translation 
from a set of clauses to a case tree. In chapter 5, we will discuss a number of problems 
that are caused by this translation. This will lead to a different approach where we 
treat clauses with overlapping patterns as definitional equalities, hence the need for the 
confluence requirement. 

3.2 Inaccessible patterns 

When we want to generalize pattern matching to functions with dependent types in the 
argument position, some new complexities arise. For example, it will not be sufficient to 
use patterns built of only constructors and variables. Inaccessible patterns will fill this 
gap, by allowing us to exploit the full power of inductive families. 

Let's look at the following example: suppose we want to define a function
$\mathrm{root} : (n : \mathrm{Nat})(p : \mathrm{Square}\ n) \to \mathrm{Nat}$ that, given a natural number n and a proof p
that n is a perfect square, computes the square root of n. Remark that by definition of
the type Square n, any canonical term of this type must be of the form sq m for some
m with $m^2 = n$. But this is exactly the square root we are looking for! So we can simply
pattern match on the proof p to get the square root of n. The definition of root consists
of the single clause root n (sq m) = m. There is only one problem with this clause: it is
not well-typed. The pattern sq m has type $\mathrm{Square}\ m^2$, while a pattern of type Square n
is expected. These types are only equal if $n = m^2$, so we might try to change the clause to
$\mathrm{root}\ m^2\ (\mathrm{sq}\ m) = m$. But this gives two new problems: the pattern is not linear because
the variable m occurs twice, and $m^2$ is neither a constructor nor a variable.

To deal with situations where there is only one type-correct term for a pattern variable, 
we introduce a new kind of pattern, called inaccessible patterns. We write inaccessible
patterns as $\lfloor t \rfloor$ where t is an arbitrary term. For example, we can now write the definition
of root as $\mathrm{root}\ \lfloor m^2 \rfloor\ (\mathrm{sq}\ m) = m$. The intuitive meaning of an inaccessible pattern $\lfloor t \rfloor$
is that t is the only type-correct term at this position in the pattern.

So in general, a pattern is built up from pattern variables, constructors, and inaccessi- 
ble patterns. The linearity condition now requires that each pattern variable occurs only 
once in an accessible position. However, there is no limit on how many times it can occur 
in the inaccessible patterns. 

When doing pattern matching, inaccessible patterns seem to pose a significant problem 
because we have to match with an arbitrary term (not just a pattern). But this is not 
a real problem: because inaccessible patterns are only allowed when there is only one 
possibility, they are guaranteed to match whenever the other parts of the patterns do. 
For example, when calling the function root with arguments n and sq m we don't have 
to check whether n is equal to $m^2$; it is guaranteed by the type system. Hence, we can
easily extend the algorithm for pattern matching by the rule given in figure 3.2. 

\[
\frac{}{\mathrm{MATCH}(\lfloor s \rfloor, t) \Rightarrow [\,]}
\]

Figure 3.2: Extending the pattern matching algorithm with inaccessible patterns. 
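
As a minimal sketch of this extension, the Python matcher below adds one case for inaccessible patterns. The class Inaccessible, the tuple encoding of terms, and the symbol "times" standing for multiplication are assumptions made for this illustration only.

```python
class Inaccessible:
    """An inaccessible pattern; term is the only type-correct term here."""

    def __init__(self, term):
        self.term = term


ZERO = ("zero",)


def suc(t):
    return ("suc", t)


def num(k):
    return ZERO if k == 0 else suc(num(k - 1))


def match(p, t):
    """Return the substitution for the accessible part of p, or None."""
    if isinstance(p, Inaccessible):
        # The new rule: always succeed with the empty substitution,
        # without inspecting the term t at all.
        return {}
    if isinstance(p, str):
        return {p: t}
    if p[0] != t[0]:
        return None
    return match_list(p[1:], t[1:])


def match_list(ps, ts):
    sigma = {}
    for p, t in zip(ps, ts):
        s = match(p, t)
        if s is None:
            return None
        sigma.update(s)
    return sigma


# Patterns of the clause for root: an inaccessible m * m, then sq m.
ROOT_PATTERNS = [Inaccessible(("times", "m", "m")), ("sq", "m")]
```

Matching these patterns against the numeral 9 and the proof sq 3 binds m to 3 without ever comparing 9 with 3 * 3. The matcher would equally accept an ill-typed first argument, so it is the type system, not the matcher, that rules out such calls.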

3.3 Formal description 

We will now give a formal description of definitions by pattern matching and the require- 
ments they have to satisfy. We give only the abstract definitions in this section; how these 
requirements can actually be checked by a computer is discussed in the next chapter. 

3.3.1 Definitions by pattern matching 

Definition 3.1 (Pattern). Let $\Delta$ be a valid context. Patterns (with pattern variables
$\Delta$) are inductively defined as follows:

• If $(x : A) \in \Delta$, then x is a pattern.

• If c is a constructor of arity n and $p_1, \ldots, p_n$ are patterns, then $c\ p_1 \ldots p_n$ is a
pattern.

• If $\Delta \vdash t : T$ (i.e. all free variables of t come from $\Delta$), then $\lfloor t \rfloor$ is a pattern.

Definition 3.2 (Linear pattern). A list of patterns $\bar{p}$ with pattern variables $\Delta$ is
linear if each variable in $\Delta$ occurs only once in an accessible position in $\bar{p}$.

Although patterns look like terms, they form a distinct syntactic class. Definition 3.3
allows us to convert patterns back to terms.

Definition 3.3 (Underlying term of a pattern). The underlying term $\lceil p \rceil$ of a
pattern p is defined as follows:

• $\lceil x \rceil = x$

• $\lceil c\ p_1 \ldots p_n \rceil = c\ \lceil p_1 \rceil \ldots \lceil p_n \rceil$

• $\lceil \lfloor t \rfloor \rceil = t$

The underlying term $\lceil p_1 \ldots p_n \rceil$ of a list of patterns is the list of underlying terms
$\lceil p_1 \rceil \ldots \lceil p_n \rceil$.

If p is a pattern with pattern variables $\Delta$ and $\Gamma\Delta \vdash \lceil p \rceil : T$, then we say that p is
a pattern of type T and we write this judgment as $\Gamma|\Delta \vdash p : T$ pattern. The vertical
bar | separates the regular variables from the pattern variables. Similarly we write
$\Gamma|\Delta \vdash \bar{p} : \Theta$ pattern if $\Gamma\Delta \vdash \lceil \bar{p} \rceil : \Theta$. We say that the patterns $p_1$ and $p_2$ are equal as terms if
their underlying terms are equal.

Definition 3.4 (Pattern matching). A list of terms $\Gamma \vdash \bar{t} : \Theta$ matches a list of
patterns $\Gamma|\Delta \vdash \bar{p} : \Theta$ pattern if there exists a substitution $\sigma$ with domain $\Delta$ such
that $\lceil \bar{p} \rceil\sigma = \bar{t}$.

Definition 3.5 (Clause). The possible clauses for a function $f : \Phi \to T$ are defined
by the following inference rule:

\[
\frac{\Gamma|\Delta \vdash \bar{p} : \Phi\ \text{pattern} \qquad \Gamma(f : \Phi \to T)\Delta \vdash t : T[\Phi \mapsto \lceil \bar{p} \rceil]}
     {\Gamma \vdash f\ \bar{p} = t\ \text{clause}}
\]

Definition 3.6. Let $\to$ be an evaluation rule and C a set of clauses for f. We define
$\to_C$ as the smallest evaluation rule extending $\to$ that satisfies $f\ (\bar{p}\sigma) \to_C t\sigma$ for
each clause $f\ \bar{p} = t$ in C and each substitution $\sigma$ with domain the pattern variables of
$\bar{p}$.



3.3.2 Conditions on the inaccessible patterns 

We have said that inaccessible patterns can only be used when there is only one possible 
type-correct term at that position. But this is not a very precise definition. In this section, 
we will define the notion of a valid pattern that gives a precise meaning to this intuition. 
First, we need two auxiliary definitions: 

Definition 3.7 (Skeleton). Let $\sqcup$ be a fixed symbol which is not a variable or a
constructor. The skeleton of a pattern is defined as follows:

• The skeleton of a pattern variable x or an inaccessible pattern $\lfloor t \rfloor$ is always $\sqcup$.

• The skeleton of a constructor pattern $c\ p_1 \ldots p_n$ is $c\ s_1 \ldots s_n$, where $s_1, \ldots, s_n$
are the skeletons of $p_1, \ldots, p_n$.

In other words, the skeleton of a pattern remembers the constructors, but forgets all 
variables and inaccessible patterns. 
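
The skeleton is easy to compute. The following Python sketch assumes, purely for illustration, that pattern variables are strings, constructor patterns are tuples headed by the constructor name, inaccessible patterns are instances of a small class, and the fixed symbol of definition 3.7 is written "_".

```python
class Inaccessible:
    """An inaccessible pattern wrapping an arbitrary term."""

    def __init__(self, term):
        self.term = term


BLANK = "_"  # stands for the fixed symbol of definition 3.7


def skeleton(p):
    """Keep the constructors of p, forget variables and inaccessible parts."""
    if isinstance(p, (str, Inaccessible)):
        return BLANK
    return (p[0],) + tuple(skeleton(q) for q in p[1:])
```

Two patterns that differ only in their variables or in the terms inside their inaccessible patterns get the same skeleton, which is exactly the situation that the notion of validity below compares.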

Definition 3.8 (Pattern specialization). Let p, q be patterns. We say that $p \supseteq q$
(q is a specialization of p) if there exists a substitution $\delta$ such that $\lceil q \rceil = \lceil p \rceil\delta$ and all
variables in the domain of $\delta$ are pattern variables of p.

It is easy to see that if $p \supseteq q$ and a list of terms t matches q, then t also matches p. This
implies that the set of lists of terms matching q is a subset of the lists of terms matching
p, hence the notation.

Intuitively, a pattern is valid if it is the most general pattern with the same skeleton. 
This idea is captured in definition 3.9. 

Definition 3.9 (Valid pattern). A list of patterns $\Gamma|\Delta \vdash \bar{p} : \Phi$ pattern is valid if
for each other list of patterns $\Gamma|\Delta' \vdash \bar{p}' : \Phi$ pattern with the same skeleton we have
$\bar{p} \supseteq \bar{p}'$.

How does this definition prevent 'bad' inaccessible patterns? Suppose a pattern p
contains an inaccessible pattern $\lfloor t \rfloor$ but there is another type-correct term t' for this
position with $t' \neq t$. Then we can also form a pattern p' that is equal to p except that
$\lfloor t \rfloor$ has been replaced by $\lfloor t' \rfloor$. Then p and p' have the same skeleton. However, there is
no substitution $\sigma$ with only pattern variables of p in its domain such that $\lceil p' \rceil = \lceil p \rceil\sigma$.
Hence p is not valid.

Valid patterns are useful because they guarantee us that inaccessible patterns will 
always match whenever the accessible parts of the pattern match. This is made precise 
in lemma 3.10. 



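Lemma 3.10 (Respectful patterns). Suppose $\Gamma|\Delta \vdash \bar{p} : \Phi$ pattern. Then $\bar{p}$ is a
valid list of patterns if and only if $\mathrm{MATCH}(\bar{p}, \bar{t}) \Rightarrow \sigma$ implies that $\lceil \bar{p} \rceil\sigma = \bar{t}$ (i.e. $\bar{t}$
matches $\bar{p}$).

Proof. Suppose $\bar{p}$ is valid and $\mathrm{MATCH}(\bar{p}, \bar{t}) \Rightarrow \sigma$. Then $\bar{t}$ matches all accessible parts of $\bar{p}$.
This means that there exists a pattern $\bar{p}'$ with the same skeleton as $\bar{p}$ such that $\lceil \bar{p}' \rceil = \bar{t}$.
Remark that $\bar{p}'$ has no pattern variables, only constructors and inaccessible patterns.
Because $\bar{p}$ is a valid pattern, there exists a substitution $\sigma'$ such that $\lceil \bar{p} \rceil\sigma' = \lceil \bar{p}' \rceil = \bar{t}$. By
definition of MATCH, we must have that $\sigma$ and $\sigma'$ are equal on all pattern variables of $\bar{p}$.
But the domain of both $\sigma$ and $\sigma'$ is exactly the set of pattern variables of $\bar{p}$ by definition,
hence $\sigma = \sigma'$. We can conclude that $\lceil \bar{p} \rceil\sigma = \bar{t}$.

Conversely, suppose $\mathrm{MATCH}(\bar{p}, \bar{t}) \Rightarrow \sigma$ implies that $\lceil \bar{p} \rceil\sigma = \bar{t}$. Let $\bar{p}'$ be any pattern
with the same skeleton as $\bar{p}$. Then we have $\mathrm{MATCH}(\bar{p}, \lceil \bar{p}' \rceil) \Rightarrow \sigma$ for some substitution $\sigma$
satisfying $\lceil \bar{p} \rceil\sigma = \lceil \bar{p}' \rceil$. So we can conclude that $\bar{p}$ is a valid pattern. □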

3.3.3 Required properties 

Just as for simple pattern matching, we still need to assure that definitions by pattern 
matching don't cause anything bad to happen. Here are the conditions in their complete 
form: 

Definition 3.11 (Completeness). A set of lists of patterns $\Gamma|\Delta_i \vdash \bar{p}_i : \Phi$ pattern
for $i = 1, \ldots, n$ is complete if for each list of closed terms $\bar{t} : \Phi$ there exists an index
i such that $\bar{t}$ matches $\bar{p}_i$.

In other words, if the patterns of a set of clauses C are complete, then the evaluation
rule $\to_C$ is complete.

Definition 3.12 (Termination). A set of clauses C is terminating if the evaluation
rule $\to_C$ is strongly normalizing.

Definition 3.13 (Confluence). A set of clauses C is confluent if the evaluation
rule $\to_C$ has the Church-Rosser property.



Definition 3.14 (Definition by pattern matching). Let C be a set of clauses for
$f : \Phi \to T$ in a context $\Gamma$ such that:

• All patterns in the clauses of C are valid.

• The patterns of the clauses in C are complete.

• The set of clauses C is terminating.

• The set of clauses C is confluent.

Then we can define the function f by adding the inference rule

\[
\frac{\Gamma\ \text{valid}}{\Gamma \vdash f : \Phi \to T}
\]

and for each clause $f\ \bar{p} = t$ in C the inference rule

\[
\frac{\Gamma \vdash \bar{s} : \Phi \qquad \mathrm{MATCH}(\bar{p}, \bar{s}) \Rightarrow \sigma}
     {\Gamma \vdash f\ \bar{s} = t\sigma : T[\Phi \mapsto \bar{s}]}
\]

From now on, we will write $\to$ for the evaluation rule containing all evaluation rules
$\to_C$ for each function f defined by a set of clauses C. Remark that each clause of the
definition holds as a definitional equality, unlike with the first-match semantics.

Theorem 3.15. The evaluation rule $\to$ is complete, strongly normalizing, and
Church-Rosser.

Proof. By construction. □ 



3.4 Pattern matching versus eliminators 

There is a general method to equip each inductive family with an eliminator, as we 
remarked in the previous chapter. This eliminator embodies the induction principle for 
that particular inductive family. For example, the eliminator for Nat is natrec and the
eliminator for Bool is if · then · else ·.² The general form of the eliminator of an
inductive family is described in, for example, [Luo94] and [GMM06].

While eliminators are certainly sufficient to write all kinds of proofs and programs on 
inductive families, they are not very convenient. That is why we choose to use pattern 
matching instead to define those proofs and programs. This allows us to give those 
definitions in an easier and more readable way. 

Remark that each set of patterns defines a new induction principle. This means we
have no need for the classic eliminators for inductive families: they can be defined using
pattern matching. For example, the eliminator of the identity type
$J : (b : A) \to (p : a =_A b) \to C\ a\ (\mathrm{refl}\ a) \to C\ b\ p$ can be defined by the single clause

$J\ \lfloor a \rfloor\ (\mathrm{refl}\ \lfloor a \rfloor)\ c = c$                        (3.6)

Be aware, though, that while pattern matching is more convenient than using the
induction principles, it is also harder to understand theoretically. It is a deep and nontrivial
result that under certain constraints, pattern matching can be expressed in terms of the
induction principles plus the so-called K axiom. This elimination of definitions by pattern
matching is described in [McB00] and [GMM06].



2 In fact, the real eliminators have a more general type than the examples given here. 



Chapter 4 



Checking pattern matching definitions

In the previous chapter, we described dependent pattern matching and how we can use it 
to define (dependent) functions. We also gave a number of conditions that these definitions 
have to satisfy in order to be well-defined: pattern validity, completeness, termination, 
and confluence. However, the conditions were formulated in an abstract way that is not 
easily checked by a computer. In order to use pattern matching in implementations of 
type theory such as Agda, we need more concrete criteria that the definitions have to 
satisfy. In this chapter, we will describe the algorithms used in Agda as described in 
[Nor07]. We will also describe the representation of pattern matching definitions by case 
trees, which are used to evaluate functions efficiently. The nice thing is that we get this 
representation 'for free' when we check completeness. 

4.1 Checking inaccessible patterns 

Inaccessible patterns are introduced by constraints on the types of the constructors in
the pattern. For example, the inaccessible pattern $\lfloor m^2 \rfloor$ in the definition of root (given
in section 3.2) was introduced because the type of sq m is $\mathrm{Square}\ m^2$. So when type
checking the definition of root, we are faced with the following problem: under what
conditions can we use a pattern of the form $c\ x_1 \ldots x_n$ of type $D\ \bar{\imath}$, and which constraints
do the indices $\bar{\imath}$ have to satisfy?

If T is a simple inductive type like Nat or Bool without indices, then the answer is
easy: any constructor c of the type T can be used, and no extra constraints are needed.
If T is part of an inductive family, the situation is more complex. For example, suppose
$T = x \leq y$ for concrete values of x, y : Nat (see section 2.4.1 for the definition of the type
$x \leq y$). There are now three possibilities for the set of constructors of T:

• If x is strictly smaller than y, then the only valid constructor of $x \leq y$ is
$\mathrm{smaller} : (n : \mathrm{Nat})(m : \mathrm{Nat})(p : n \leq m) \to n \leq (\mathrm{suc}\ m)$. This gives the
constraints that x = n and y = suc m.

• If x and y are equal, then the only valid constructor of $x \leq y$ is
$\mathrm{equal} : (n : \mathrm{Nat}) \to n \leq n$. This gives the constraint that x = y = n.

• If x is bigger than y, then there are no valid constructors and hence $x \leq y$ is empty.

This problem can be solved in general by unifying the indices of T with the indices of the
type of each constructor.

4.1.1 Unification 

Suppose we want to unify two terms a and b, i.e. we are interested in substitutions $\delta$ such
that $a\delta = b\delta$. Such a substitution is called a unifier of a and b. A most general unifier of
a and b is a unifier $\delta$ such that for each other unifier $\delta'$, there exists a substitution $\sigma$ such
that $\delta' = \delta; \sigma$.

The question whether unifiers exist is called the unification problem. In general, this is 
an undecidable problem. There exist unification algorithms that search for a most general 
unifier (we will give one below) but they can give up in case the unification problem is too 
hard. We say that a unification algorithm succeeds positively if it finds a most general 
unifier, that it succeeds negatively if it concludes that there exist no unifiers, and that it 
fails if it gives up. 

Returning to the previous example, to know whether smaller is a constructor of the
type $x \leq y$, we have to unify the types $n \leq (\mathrm{suc}\ m)$ and $x \leq y$. The unification algorithm
will succeed positively with result $[x \mapsto n, y \mapsto \mathrm{suc}\ m]$.

Remark that a most general unifier of two terms is not necessarily unique: in the
example, $[n \mapsto x, y \mapsto \mathrm{suc}\ m]$ is also a most general unifier. In a situation like this,
the unification algorithm has to make an arbitrary choice between the two unifiers. To
avoid this problem, we will supply the algorithm with a set of variables called the flexible
variables. The flexible variables are the variables that can be used for the unification,
i.e. that can occur in the domain of the substitution. If we let x and y be the flexible
variables in the example, then the unification algorithm will always return the substitution
$[x \mapsto n, y \mapsto \mathrm{suc}\ m]$.

The unification algorithm is described in figure 4.1. This algorithm is the same as the
one used in [Nor07]. The algorithm has as input two terms u and v and a set of flexible
variables $\zeta$. We write $\mathrm{UNIFY}(u, v) \Rightarrow \sigma$ if unification of u and v succeeds positively and
$\mathrm{UNIFY}(u, v) \Uparrow$ if unification succeeds negatively. If we have neither $\mathrm{UNIFY}(u, v) \Rightarrow \sigma$ nor
$\mathrm{UNIFY}(u, v) \Uparrow$, then the algorithm fails and we write $\mathrm{UNIFY}(u, v) \not\Rightarrow$.
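
The three possible outcomes can be illustrated by a small Python sketch of first-order unification with flexible variables. It is a simplification under stated assumptions, not the algorithm of [Nor07] itself: terms are strings (variables) or tuples headed by a constructor name, definitional equality is approximated by syntactic equality, and the names NEGATIVE and FAIL are hypothetical markers for the two unsuccessful outcomes.

```python
NEGATIVE, FAIL = "negative", "fail"

ZERO = ("zero",)


def suc(t):
    return ("suc", t)


def free_vars(t):
    if isinstance(t, str):
        return {t}
    return set().union(*(free_vars(a) for a in t[1:]))


def subst(t, sigma):
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(subst(a, sigma) for a in t[1:])


def unify(u, v, flexible):
    """Return a substitution (positive success), NEGATIVE, or FAIL."""
    if u == v:
        return {}                      # already equal: nothing to solve
    for a, b in ((u, v), (v, u)):
        if isinstance(a, str) and a in flexible:
            if a not in free_vars(b):
                return {a: b}          # solve the flexible variable
            return NEGATIVE            # occurs check: a cyclic equation
    if isinstance(u, str) or isinstance(v, str):
        return FAIL                    # rigid variable: give up
    if u[0] != v[0]:
        return NEGATIVE                # distinct constructors never unify
    return unify_list(u[1:], v[1:], flexible)


def unify_list(us, vs, flexible):
    """Unify two lists of terms, threading the substitution through."""
    sigma = {}
    for u, v in zip(us, vs):
        step = unify(subst(u, sigma), subst(v, sigma), flexible)
        if step in (NEGATIVE, FAIL):
            return step
        # Compose: apply the new step to the earlier solutions, then add it.
        sigma = {x: subst(t, step) for x, t in sigma.items()}
        sigma.update(step)
    return sigma
```

Unifying the index lists n, suc m and x, y gives a different most general unifier depending on which variables are declared flexible, which is the reason for fixing that set up front.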

4.1.2 Specialization steps 

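Now that we know how to do unification, we can use it to build valid patterns.

Definition 4.1 (Specialization step). Suppose $\Gamma|\Delta \vdash \bar{p} : \Phi$ pattern and let
$(x : D\ \bar{\imath}) \in \Delta$ be a pattern variable of type $D\ \bar{\imath}$ where D is an inductive family.
Let $c : \Theta \to D\ \bar{\jmath}$ be a constructor of D.

• If $\mathrm{UNIFY}(\bar{\imath}, \bar{\jmath}) \Rightarrow \delta$ with flexible variables $\Delta\Theta$, then we write $\bar{p} \Rightarrow^{x}_{c} \bar{p}'$ where
$\bar{p}' = \bar{p}([x \mapsto c\ \Theta]; \lfloor \delta \rfloor)$. Here $\lfloor \delta \rfloor$ for a substitution $\delta = [x_1 \mapsto t_1, \ldots, x_n \mapsto t_n]$ is
defined as $\lfloor \delta \rfloor = [x_1 \mapsto \lfloor t_1 \rfloor, \ldots, x_n \mapsto \lfloor t_n \rfloor]$.

• If $\mathrm{UNIFY}(\bar{\imath}, \bar{\jmath}) \Uparrow$ with flexible variables $\Delta\Theta$, then we write $\bar{p} \Uparrow^{x}_{c}$.
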
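\begin{gather*}
\frac{x \in \zeta \qquad x \notin FV(v)}{\mathrm{UNIFY}(x, v) \Rightarrow [x \mapsto v]}
\qquad
\frac{x \in \zeta \qquad x \in FV(\bar{v})}{\mathrm{UNIFY}(x, c\ \bar{v}) \Uparrow}
\\[1ex]
\frac{\mathrm{UNIFY}(\bar{u}, \bar{v}) \Rightarrow \sigma}{\mathrm{UNIFY}(c\ \bar{u}, c\ \bar{v}) \Rightarrow \sigma}
\qquad
\frac{\mathrm{UNIFY}(\bar{u}, \bar{v}) \Uparrow}{\mathrm{UNIFY}(c\ \bar{u}, c\ \bar{v}) \Uparrow}
\qquad
\frac{c_1 \neq c_2}{\mathrm{UNIFY}(c_1\ \bar{u}, c_2\ \bar{v}) \Uparrow}
\\[1ex]
\frac{\Gamma \vdash u = v : T}{\mathrm{UNIFY}(u, v) \Rightarrow [\,]}
\qquad
\frac{\mathrm{UNIFY}(u, v) \Rightarrow \sigma}{\mathrm{UNIFY}(v, u) \Rightarrow \sigma}
\qquad
\frac{}{\mathrm{UNIFY}(\epsilon, \epsilon) \Rightarrow [\,]}
\\[1ex]
\frac{\mathrm{UNIFY}(u, v) \Rightarrow \sigma \qquad \mathrm{UNIFY}(\bar{u}\sigma, \bar{v}\sigma) \Rightarrow \sigma'}{\mathrm{UNIFY}(u\ \bar{u}, v\ \bar{v}) \Rightarrow \sigma; \sigma'}
\\[1ex]
\frac{\mathrm{UNIFY}(u, v) \Uparrow}{\mathrm{UNIFY}(u\ \bar{u}, v\ \bar{v}) \Uparrow}
\qquad
\frac{\mathrm{UNIFY}(u, v) \Rightarrow \sigma \qquad \mathrm{UNIFY}(\bar{u}\sigma, \bar{v}\sigma) \Uparrow}{\mathrm{UNIFY}(u\ \bar{u}, v\ \bar{v}) \Uparrow}
\end{gather*}

Figure 4.1: The unification algorithm. The second rule states that unification succeeds
negatively when a variable occurs cyclically, which would otherwise result in an infinitely
large term.

Remark that $\bar{p} \Rightarrow^{x}_{c} \bar{p}'$ implies that $\bar{p} \supseteq \bar{p}'$, so specialization steps give us specialized
patterns. We define specialization steps because we can use them to build valid patterns.
This is expressed by lemma 4.2.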

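Lemma 4.2. If $\bar{p}$ is a valid pattern of type $\Phi$ and $\bar{p} \Rightarrow^{x}_{c} \bar{p}'$, then $\bar{p}'$ is also a valid
pattern of type $\Phi$.

Proof. This follows directly from the proof of lemma 11 of [GMM06] and lemma 3.10. □

Definition 4.3 (Specialization sequence). If there is a sequence
$\bar{p}_0 \Rightarrow^{x_1}_{c_1} \bar{p}_1 \Rightarrow^{x_2}_{c_2} \cdots \Rightarrow^{x_n}_{c_n} \bar{p}_n$ of length $n \geq 0$, then we write $\bar{p}_0 \Rightarrow^{*} \bar{p}_n$.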

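Corollary 4.4 (Construction of valid patterns). Let $\Phi$ be a valid context. If
$\bar{x} \Rightarrow^{*} \bar{p}$, where $\bar{x}$ is the trivial pattern consisting of the variables of $\Phi$, then $\bar{p}$ is a
valid pattern of type $\Phi$.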

Now we know how to build valid patterns constructor by constructor. It is also possible 
to check whether a given pattern is valid by attempting to reconstruct it in this way. This 
is described in sections 2.1.6 and 2.1.7 of [Nor07]. 

4.2 Checking completeness with coverings 

In order to check completeness, we have to make sure that at each position in the pattern, 
all possible constructors are considered. For example, to define a function $f : \mathrm{Nat} \to T$,
there should be two clauses f zero = … and f (suc n) = … corresponding to the two
constructors zero and suc. We say that the trivial pattern m : Nat splits on m into the
two patterns zero and suc n. We can split the pattern suc n again on n to get the
two patterns suc zero and suc (suc k). By repeatedly splitting patterns on a pattern
variable, we get a covering. For example, we just showed that

{zero, suc zero, suc (suc k)}

is a covering. Since each constructor is considered each time we split a pattern, a covering 
will always be complete. We will prove this in theorem 4.8. 
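
For the simple type Nat, splitting can be sketched directly in Python. The sketch below is an illustration under assumptions: it only knows the two constructors of Nat (recorded with their arities in the hypothetical table CONSTRUCTORS), it ignores indices and therefore needs no unification, and fresh variable names are generated as x0, x1, and so on.

```python
import itertools

ZERO = ("zero",)

# Constructors of Nat with their arities; every argument has type Nat.
CONSTRUCTORS = {"zero": 0, "suc": 1}


def num(k):
    return ZERO if k == 0 else ("suc", num(k - 1))


def fresh_names():
    return (f"x{i}" for i in itertools.count())


def replace(p, x, q):
    """Replace the pattern variable x in p by the pattern q."""
    if isinstance(p, str):
        return q if p == x else p
    return (p[0],) + tuple(replace(a, x, q) for a in p[1:])


def split(p, x, fresh):
    """Split p on the variable x: one new pattern per constructor."""
    result = []
    for c, arity in CONSTRUCTORS.items():
        args = tuple(next(fresh) for _ in range(arity))
        result.append(replace(p, x, (c,) + args))
    return result


def matches(p, t):
    """Does the closed term t match the pattern p?"""
    if isinstance(p, str):
        return True
    return p[0] == t[0] and all(matches(q, u) for q, u in zip(p[1:], t[1:]))


fresh = fresh_names()
first = split("m", "m", fresh)           # zero, suc x0
second = split(first[1], "x0", fresh)    # suc zero, suc (suc x1)
covering = [first[0]] + second
```

Every numeral matches exactly one of the three patterns in covering, which is the completeness property that theorem 4.8 establishes in general.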

As in the previous section, we need to be careful if the type of a pattern variable is 
part of an inductive family. Luckily, we can reuse the unification algorithm to check what 
constructors can be used. This leads to definition 4.5.

Definition 4.5 (Splitting/Direct covering). Suppose $\Gamma|\Delta \vdash \bar{p} : \Phi$ pattern and
$(x : T) \in \Delta$ where T is a type from an inductive family with constructors $c_1, \ldots, c_n$.
Suppose furthermore that $\bar{p} \Rightarrow^{x}_{c_i} \bar{p}_i$ or $\bar{p} \Uparrow^{x}_{c_i}$ for all $i = 1, \ldots, n$. Then we call the set
$\{\bar{p}_i \mid \bar{p} \Rightarrow^{x}_{c_i} \bar{p}_i\}$ a splitting or a direct covering of $\bar{p}$.

Lemma 4.6 tells us that splitting a pattern doesn't change which closed terms match 
the patterns. This is the essential step in showing that a covering is complete. 

Lemma 4.6. If $\{\bar{p}_1, \ldots, \bar{p}_n\}$ is a direct covering of $\Gamma|\Delta \vdash \bar{p} : \Phi$ pattern and $\bar{t} : \Phi$ is
a closed term matching $\bar{p}$, then $\bar{t}$ matches one of the $\bar{p}_i$.

Proof. Suppose $\bar{t} : \Phi$ is closed and $\lceil \bar{p} \rceil\sigma = \bar{t}$ for some substitution $\sigma$ with domain $\Delta$. By
definition of a direct covering, there is a variable $(x : D\ \bar{u}) \in \Delta$ and constructors $c_1, \ldots, c_n$
such that $\bar{p} \Rightarrow^{x}_{c_i} \bar{p}_i$ for $i = 1, \ldots, n$. Let $s = x\sigma$, then s is also a closed term because
$\bar{t}$ is. By strong normalization and completeness, we have that $s \to^{*} c\ s_1 \ldots s_k$ for
some constructor c of the inductive family D. Let $c : \Psi \to D\ \bar{v}$ be the type of c and
$\tau = [\Psi \mapsto s_1 \ldots s_k]$. Remark that we can decompose $\sigma$ as $[x \mapsto c\ \Psi]; \tau; \sigma'$ where x is not
in the domain of $\sigma'$.

By type-correctness of $\bar{t} = \lceil \bar{p} \rceil [x \mapsto c\ \Psi]\tau\sigma' : \Phi$, we must have that $\tau; \sigma'$ unifies the
types $D\ \bar{u}$ and $D\ \bar{v}$ of x and $c\ \Psi$ respectively. This implies that the unification of $\bar{u}$
and $\bar{v}$ does not succeed negatively, and it also does not fail by definition of a direct
covering. So we must have that the unification succeeds positively, i.e. $\bar{p} \Rightarrow^{x}_{c} \bar{p}'$ for
$\bar{p}' = \bar{p}[x \mapsto c\ \Psi]\lfloor \delta \rfloor$ where $\delta$ is a most general unifier of $\bar{u}$ and $\bar{v}$. But $\tau; \sigma'$ was also
a unifier of $\bar{u}$ and $\bar{v}$, so there must exist some $\rho$ such that $\tau; \sigma' = \delta; \rho$. Then we have
$\bar{t} = \lceil \bar{p} \rceil [x \mapsto c\ \Psi]\tau\sigma' = \lceil \bar{p} \rceil [x \mapsto c\ \Psi]\delta\rho = \lceil \bar{p}' \rceil\rho$.

Again by definition of a direct covering, $\bar{p}'$ must equal $\bar{p}_i$ and c must equal $c_i$ for some
i between 1 and n. So we have $\bar{t} = \lceil \bar{p}_i \rceil\rho$, as we wanted to prove. □

Definition 4.7 (Covering). The coverings of $\bar{p}$ are defined as follows:

1. The set $\{\bar{p}\}$ is a covering of $\bar{p}$.

2. If $\{\bar{p}_1, \ldots, \bar{p}_n\}$ is a direct covering of $\bar{p}$ and $O_i$ is a covering of $\bar{p}_i$ for $i = 1, \ldots, n$,
then $\bigcup_{i=1}^{n} O_i$ is a covering of $\bar{p}$.

A covering, without further qualification, is a covering of the trivial pattern (consisting
of only pattern variables).

Theorem 4.8 follows directly from lemma 4.6. 

Theorem 4.8. A covering is always complete. 

We know that a set of patterns is complete if it forms a covering, but how do we 
recognize coverings? In general, this problem is known to be undecidable [GMM06]. The 
problem is the following: an empty covering can be built by an arbitrarily large number of 
splittings, if all branches lead to a negative unification. So when reconstructing a covering, 
we cannot know when to stop splitting. The answer to this problem used in [GMM06]
and [Nor07] is to require explicit refutations in the pattern if $\bar{p} \Uparrow^{x}_{c}$ for all applicable
constructors c. So we will extend the syntax of our patterns by the absurd pattern $\emptyset$ as
follows:

\[
\frac{\forall c : \bar{p} \Uparrow^{x}_{c}}{\bar{p} \Rightarrow^{x}_{\emptyset} \bar{p}[x \mapsto \emptyset]}
\]

A clause with an absurd pattern is only used to make the job of the completeness checker
easier; it will never match a closed term and hence it does not need a right-hand side.

\[
\frac{\Gamma|\Delta \vdash \bar{p} : \Phi\ \text{pattern} \qquad \bar{p} \Rightarrow^{x}_{\emptyset} \bar{p}'}{\Gamma \vdash f\ \bar{p}'\ \text{clause}}
\]

Using these absurd patterns, it again becomes decidable whether a given set of patterns 
forms a covering [GMM06]. 

4.3 Checking termination with the structural order

In order to check termination, we have to put a condition on the arguments of the recursive 
calls in the right-hand side. This should prevent clauses such as 

f (suc n) = f (suc (suc n))

which lead to non-termination. To do this, it is required that the arguments of the
recursive calls in the right-hand side are built from a strictly smaller number of
constructors than the left-hand side pattern. If this is the case, then we call the recursive
arguments structurally smaller. For example, n is structurally smaller than suc n, but
suc (suc n) is not. Hence the clause f (suc n) = f n is allowed, but
f (suc n) = f (suc (suc n)) is not.

It is possible that a constructor of an inductive type T has an argument of a type of
the form $\Delta \to T$ where $\Delta$ is a nonempty context. The strict positivity requirement
ensures that T does not occur in $\Delta$. For example, we can define a type of infinitely
branching trees InfTree with two constructors

leaf : InfTree
$\mathrm{branch} : (\mathrm{Nat} \to \mathrm{InfTree}) \to \mathrm{InfTree}$

The tree branch h has infinitely many direct subtrees, namely one subtree h n for each
n : Nat. While defining a function $f : \mathrm{InfTree} \to B$, we have that $h : \mathrm{Nat} \to \mathrm{InfTree}$
is structurally smaller than branch h. However, this doesn't allow us to make recursive
calls of f on the direct subtrees h n. To solve this problem, a function h is defined to be
structurally as large as its largest value h n. Formally, this means that if h is structurally
smaller than t, then h n is also structurally smaller than t for any n. For example, this
allows us to define a function $f : \mathrm{InfTree} \to \mathrm{Nat}$ as follows:

f leaf = zero
f (branch h) = suc (f (h zero))

We are now ready to give the definition of the structural order: 

Definition 4.9 (The structural order). The structural order $\prec$ between terms is
inductively defined by the following rules:

\[
\frac{}{t_i \prec c\ t_1 \ldots t_n}
\qquad
\frac{f \prec t}{f\ s \prec t}
\qquad
\frac{r \prec s \qquad s \prec t}{r \prec t}
\]

This is the same structural order as the one used in [GMM06].
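
The structural order is decidable by a simple recursion on terms. The Python sketch below is an illustration, with an encoding chosen for this example only: variables are strings, constructor applications are tuples headed by the constructor name, and the application of a function h to an argument s is written as the tuple ("app", h, s).

```python
ZERO = ("zero",)


def suc(t):
    return ("suc", t)


def smaller(s, t):
    """Decide whether s is structurally smaller than t."""
    # Application rule: if f is smaller than t, then so is f s.
    if isinstance(s, tuple) and s[0] == "app" and smaller(s[1], t):
        return True
    # Subterm rule, closed under transitivity: s is an argument of the
    # constructor application t, or smaller than one of its arguments.
    if isinstance(t, tuple) and t[0] != "app":
        return any(s == a or smaller(s, a) for a in t[1:])
    return False
```

With this encoding, n is smaller than suc n and than suc (suc n), while suc (suc n) is not smaller than suc n. The call h zero is smaller than branch h, which is what justifies the recursive call in the InfTree example.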

It is nice to know that the arguments of the function become smaller with each recursive
call, but this doesn't guarantee termination by itself. Indeed, if there existed an
infinite sequence of terms, each smaller than the last, then evaluation could still go on
forever. So we have to check that no such infinite sequences exist. If there are
no such sequences for a certain order <, then the order is called well-founded. Theorem
4.10 tells us that this requirement is satisfied for the structural order.

Theorem 4.10. The structural order $\prec$ is well-founded.

Proof. A proof of theorem 4.10 is given for a version of $\lambda$-calculus with simple types and
pattern matching in [AA02]. Since termination is a property of terms and not of their
types, one can expect that it is not too hard to adapt the proof to a dependently typed
context. In fact, [GMM06] also contains a very indirect proof for the case with dependent
types. We will not give the proof here because termination is not the focus of this thesis. □

We are now almost ready to give the criterion for termination. The only case we
have not considered yet is that of functions with multiple arguments. It is not sufficient to
require that each clause separately reduces the size of one of the arguments. Consider for
example the following clauses of a function f : Nat → Nat → Nat:

    f zero    zero    = zero
    f (suc m) zero    = f m (suc zero)
    f m       (suc n) = f (suc (suc m)) n

Then we have

    f (suc m) zero → f m (suc zero) → f (suc (suc m)) zero → ...

which will go on infinitely. To avoid this problem, it is required that there is a number k
such that each clause reduces the size of the k-th argument. This leads to the full criterion
for termination given in corollary 4.11.
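The loop above can be observed on closed arguments with a minimal Python sketch. This is only an illustration, not part of the formal development: natural numbers are encoded as Python integers, and the names `step` and `trace` are made up for this example.

```python
def step(m, n):
    """One rewrite step of f on closed arguments; None means a normal form."""
    if m == 0 and n == 0:
        return None            # f zero zero = zero (no recursive call)
    if n == 0:
        return (m - 1, 1)      # f (suc m) zero = f m (suc zero)
    return (m + 2, n - 1)      # f m (suc n) = f (suc (suc m)) n


def trace(m, n, fuel):
    """Arguments of the first `fuel` recursive calls, starting from f m n."""
    calls = [(m, n)]
    while fuel > 0:
        nxt = step(m, n)
        if nxt is None:
            break
        m, n = nxt
        calls.append(nxt)
        fuel -= 1
    return calls


# Every step shrinks one argument, yet the pair never reaches (0, 0).
print(trace(1, 0, 6))
```

Each individual step decreases either the first or the second argument, but the first argument grows again two steps later, which is why a single fixed argument position k is required.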

Corollary 4.11. Let C be a set of clauses for the function f : Δ → T with n patterns
each and let 1 ≤ k ≤ n. If for each clause f p = t in C and all recursive calls f s in
t we have s_k ≺ ⌈p_k⌉, then the clauses C are terminating.
For the proof, we first need the following easy lemma.

Lemma 4.12. If s ≺ t, then sσ ≺ tσ for any substitution σ.

Proof of the lemma. By induction on the definition of ≺. □

Proof of corollary 4.11. Let C be a set of clauses defining the function f. Consider the
following property T_f(s) of a list of terms s of length n:

"If s has type Δ, then evaluation of f s always reaches a normal form after a
finite number of evaluation steps of →_C"

To prove that the clauses C are terminating, we have to prove that T_f(s) holds for any
s. By the principle of well-founded induction, it is sufficient to prove that f s reaches
a normal form from the assumption that f s' reaches a normal form for any s' : Δ with
s'_k ≺ s_k. But all recursive calls of f in the right-hand side of the clauses of C satisfy this
condition, by assumption of the corollary. By the lemma, this stays true when we apply
the function to a concrete argument. Hence we can conclude that the set of clauses C is
terminating. □
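The criterion of corollary 4.11 can be sketched as a small syntactic check in Python. The sketch below makes several simplifying assumptions: terms are nested tuples, variables are strings, only the constructor rule and transitivity of the structural order are implemented (the rule for applied functions is left out), and the names `proper_subterm`, `recursive_calls` and `decreasing_argument` are invented for this illustration.

```python
# A variable is a str; a constructor or function application is a tuple
# (name, arg, ...). Patterns are written as the terms they stand for.
ZERO = ("zero",)


def suc(t):
    return ("suc", t)


def proper_subterm(s, t):
    """s is structurally smaller than t (constructor rule plus transitivity)."""
    if isinstance(t, str):
        return False
    return any(s == a or proper_subterm(s, a) for a in t[1:])


def recursive_calls(fname, term):
    """Yield the argument tuple of every call to fname inside term."""
    if isinstance(term, str):
        return
    if term[0] == fname:
        yield term[1:]
    for a in term[1:]:
        yield from recursive_calls(fname, a)


def decreasing_argument(fname, clauses):
    """Return a 1-based position k that decreases in every recursive call, or None."""
    arity = len(clauses[0][0])
    for k in range(arity):
        if all(proper_subterm(args[k], pats[k])
               for pats, rhs in clauses
               for args in recursive_calls(fname, rhs)):
            return k + 1
    return None


PLUS = [([ZERO, "y"], "y"),
        ([suc("x"), "y"], suc(("plus", "x", "y")))]

LOOP = [([ZERO, ZERO], ZERO),
        ([suc("m"), ZERO], ("f", "m", suc(ZERO))),
        (["m", suc("n")], ("f", suc(suc("m")), "n"))]

print(decreasing_argument("plus", PLUS))  # the first argument decreases
print(decreasing_argument("f", LOOP))     # no single position works
```

The looping definition is rejected because its second clause only decreases the first argument while its third clause only decreases the second.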

The current implementation of termination checking in Agda is more complicated than 
described in this section. For instance, it uses size-change termination [LJB01] and sup- 
ports corecursive definitions. A good overview of the modern approaches to termination 
can be found in [VCW12]. 1 



1 Thanks to Thorsten Altenkirch, Nils Anders Danielsson, and James McKinna for pointing me to the
references in this section.

4.4 Case trees

As far as defining functions goes, it is fine to represent a definition by a set of clauses. But
if we want to evaluate the function or reason about it, it is better to have a representation
that gives more structure. Hence definitions by pattern matching are often represented as
case trees. A case tree tells us how the patterns of the definition are built by introducing
constructors step by step. Each leaf of a case tree for a function f corresponds to a clause
for f.

Definition 4.13 (Case tree). A case tree for a function f : Δ → T with label
p : Δ pattern is one of the following:

• Either it is an internal node, and the subtrees are again case trees for f such
that the labels of the subtrees form a direct covering of p,

• or it is a leaf node containing a term t : T[x ↦ ⌈p⌉] called the right-hand side,

• or p contains an absurd pattern ∅; in this case the tree is a leaf node with no
right-hand side.

A case tree for f is a case tree with as its root label the trivial pattern Δ consisting
only of pattern variables.



Here is an example of a case tree for the function plus : Nat → Nat → Nat:

    n m                                      (split on n : Nat)
      zero m       ↦ m
      (suc n) m    ↦ suc (plus n m)

There are a few advantages to using case trees instead of plain sets of clauses. Firstly, 
they give an efficient method to evaluate functions defined by pattern matching. Secondly, 
the patterns at the leaves of a case tree always form a covering, hence they are complete. 
Thirdly, each node in a case tree corresponds exactly to the application of an eliminator for 
an inductive family, so they are a useful intermediate step in the translation of dependent 
pattern matching to pure type theory (without pattern matching) as done in [GMM06]. 

However, case trees are not sufficient to represent all pattern matching definitions. 
This problem will become even more apparent when we consider overlapping patterns, 
since these cannot be represented by case trees at all. 



4.4.1 Evaluating a function given by a case tree 

If we have a case tree for a function f : Δ → T, then we can use it to evaluate f t
efficiently without considering all clauses of f separately. This is done by matching the
arguments of f against the labels of the case tree. Because the root of a case tree is
labeled by the trivial pattern, we know that any arguments will always match the label of
the root. We will then recursively look for a subtree such that the arguments match the
label of that subtree until we arrive at a leaf node. Finding the correct subtree is done
by matching the arguments against the label of each subtree.

If the arguments are closed terms, then lemma 4.6 guarantees that at least one subtree
will give a match. However, if the arguments contain free variables then it is not always
possible to find a matching subtree. In this case, evaluation gets stuck until the free
variable is instantiated.
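The evaluation procedure just described can be sketched in Python as follows. This is a simplified illustration under stated assumptions: terms are nested tuples, variables are strings, arguments contain only constructors and free variables, and the tree encoding (`"node"` and `"leaf"` tuples) and all function names are invented for this sketch.

```python
# A variable is a str; a constructor application is a tuple (name, arg, ...).
ZERO = ("zero",)


def suc(t):
    return ("suc", t)


def match(pat, term, subst):
    """True on a match (subst is extended), False on a constructor clash,
    None when a free variable blocks the match."""
    if isinstance(pat, str):
        subst[pat] = term
        return True
    if isinstance(term, str):
        return None
    if pat[0] != term[0]:
        return False
    results = [match(p, t, subst) for p, t in zip(pat[1:], term[1:])]
    if any(r is False for r in results):
        return False
    return None if any(r is None for r in results) else True


def match_all(pats, args):
    """Match a list of patterns against a list of arguments."""
    subst = {}
    results = [match(p, a, subst) for p, a in zip(pats, args)]
    if any(r is False for r in results):
        return False, subst
    return (None if any(r is None for r in results) else True), subst


def apply(subst, term):
    """Instantiate the pattern variables of a right-hand side."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply(subst, a) for a in term[1:])


def evaluate(tree, args):
    """Follow matching labels down to a leaf; None means evaluation is stuck."""
    kind, label, payload = tree
    if kind == "leaf":
        status, subst = match_all(label, args)
        return apply(subst, payload) if status is True else None
    for child in payload:
        status, _ = match_all(child[1], args)
        if status is True:
            return evaluate(child, args)
    return None


PLUS_TREE = ("node", ["n", "m"], [
    ("leaf", [ZERO, "m"], "m"),
    ("leaf", [suc("n"), "m"], suc(("plus", "n", "m"))),
])

print(evaluate(PLUS_TREE, [suc(ZERO), "y"]))  # one step of plus (suc zero) y
print(evaluate(PLUS_TREE, ["x", ZERO]))       # stuck on the free variable x
```

The function performs a single evaluation step; the second call returns `None` because neither subtree can be selected while x is a free variable.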

4.4.2 Building a case tree from a set of clauses 

In this section, we will give an algorithm to construct a case tree from a given (complete) 
set of clauses. This algorithm is adapted from [Nor07]. The algorithm does not require 
the input patterns to form a covering, so the patterns of the case tree will not be the 
same as the input patterns. It will however try to build a covering that is as close to 
the original definition as possible. In particular, when dealing with overlapping patterns, 
the algorithm will choose whatever pattern comes first. In other words, the resulting case 
tree follows the first-match semantics of pattern matching. 

The algorithm starts with an ordered set of clauses C for a function f : Δ → T, and
its goal is to construct a case tree with label Δ. The algorithm works by recursively
constructing a case tree with label q, while maintaining the invariant that C is nonempty
and q ⊒ p_i for each clause f p_i = t_i in C, i.e. there exist substitutions δ_i such that

\[ \lceil q \rceil \delta_i = \lceil p_i \rceil \]

If there is a clause f p_i = t_i in C such that we have p_i ⊒ q, then we have reached a
leaf node. The right-hand side is t_iσ, where σ is the substitution such that q = p_iσ. It
might be that there are multiple clauses for which p_i ⊒ q; in this case the first clause is
chosen.

Otherwise we choose a pattern variable x in q such that there is a direct covering
{q_1, …, q_k} of q, where each q_j is obtained from q by instantiating x with the
constructor c_j. For each j from 1 to k, we will recursively build a case
tree with label q_j. To do this, we will divide C into a number of disjoint sets: one set C_j
for each constructor c_j and two additional sets C_var and C_rest.

\[
\begin{aligned}
C_j &= \{\, f\ p_i = t_i \mid x\delta_i \text{ is of the form } c_j\ r_1 \ldots r_l \,\} \\
C_{var} &= \{\, f\ p_i = t_i \mid x\delta_i \text{ is a variable} \,\} \qquad\qquad (4.1) \\
C_{rest} &= \{\, f\ p_i = t_i \mid x\delta_i \text{ is an inaccessible pattern} \,\}
\end{aligned}
\]

For each constructor c_j, we also define a set of specialized clauses C'_j as follows:

\[
C'_j = \{\, f\ p_i\sigma = t_i\sigma \mid (f\ p_i = t_i) \in C_{var},\ p_i\sigma \text{ is } p_i \text{ with the variable } x\delta_i \text{ instantiated by } c_j \,\} \qquad (4.2)
\]

The subtree with label q_j is now recursively constructed using the clauses C_j ∪ C'_j. Remark
that the original clauses in C_var and C_rest are thrown away; they are not used in the further
construction of the case tree.

Remark that we have not yet specified how to choose the variable x. In order to make
progress, we should always choose a blocking variable: this is a pattern variable x of q
such that the corresponding pattern xδ_i in at least one p_i is a constructor pattern. There
may be multiple blocking variables; in this situation we have to choose one. The problem
is that not all choices give us a complete case tree and even if two different choices both
give a case tree, they will not always be equivalent. The consequences of this choice and
a possible solution will be discussed in the next chapter.
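The partition of the clauses and the notion of a blocking variable can be sketched in Python. The sketch only covers the root node, where the label is the trivial pattern and the pattern corresponding to the chosen variable is simply the i-th pattern of each clause; inaccessible patterns (and hence C_rest) are left out. The names `blocking_positions` and `partition` are made up for this illustration.

```python
# A pattern is a str (pattern variable) or a tuple (constructor, subpattern, ...).
ZERO = ("zero",)


def suc(p):
    return ("suc", p)


def blocking_positions(clauses):
    """Argument positions where at least one clause has a constructor pattern."""
    arity = len(clauses[0][0])
    return [i for i in range(arity)
            if any(isinstance(pats[i], tuple) for pats, _ in clauses)]


def partition(clauses, i):
    """Split the clauses on argument i: one list per constructor, plus C_var."""
    by_constructor, c_var = {}, []
    for clause in clauses:
        pat = clause[0][i]
        if isinstance(pat, tuple):
            by_constructor.setdefault(pat[0], []).append(clause)
        else:
            c_var.append(clause)
    return by_constructor, c_var


# Three clauses for plus; right-hand sides are kept as plain strings here.
PLUS = [
    ([ZERO, "y"], "y"),
    ([suc("x"), ZERO], "suc x"),
    (["x", suc("y")], "suc (plus x y)"),
]

print(blocking_positions(PLUS))          # both arguments are blocking
by_constructor, c_var = partition(PLUS, 0)
print(sorted(by_constructor), c_var)     # the third clause ends up in C_var
```

Splitting on the first argument puts the third clause in C_var, so it has to be specialized once per constructor before the construction can continue.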



Chapter 5 



Overlapping and Order-independent 
Patterns 

In this chapter, we will present the motivations for the extension of pattern matching
discussed in the rest of this thesis. We will first describe the two design goals that have
shaped dependent pattern matching as it currently is: the need to write definitions in terms
of eliminators, and the first-match semantics for overlapping patterns. These two forces
have led to representing functions by case trees as in the last chapter. However, this
restriction leads to some problems that we will discuss. At the end of this chapter, we
will describe how these problems can be solved by allowing definitions with more general
pattern sets.

Remark that the problems we try to fix are not just problems with the current approach,
but consequences of the two design goals. So in order to fix the problems, we will
have to give up on both goals. We argue that this is worth it from a practical perspective
because it leads to definitions that are both precise and easy to understand.

5.1 Two design goals of pattern matching 

When we want to use dependent pattern matching, a conflict between theory and practice
presents itself. On the one hand, type theorists want definitions to be represented by case
trees and to be structurally recursive, because functions defined in this way can be written
in terms of eliminators [GMM06]. This guarantees that these functions will not break
important properties of type theory such as strong normalization or the Church-Rosser
property.

On the other hand, functional programmers are used to writing definitions with over- 
lapping patterns that follow the first-match semantics, because this reduces the number 
of clauses required in some cases. Hence they might think it is very restrictive and cum- 
bersome to allow only coverings as the patterns of a function. The consequence of using 
first-match semantics is that clauses only hold as definitional equalities when all the pre- 
vious clauses have failed to match the arguments. Hence the meaning of a clause depends 
on all the clauses above it, making it harder to understand. 

In an attempt to reconcile these two goals, Agda [Nor07] allows patterns to overlap but
translates definitions internally to a case tree as described in the previous chapter. While
building this case tree, first-match semantics are used to choose between overlapping
clauses. Internally, Agda will continue to use this case tree to represent the function, not
the original definition.



5.2 Problem statement 

The differences between the original definition and the case tree lead to three problems: 
the first-match semantics don't capture the true meaning of overlapping patterns, clauses 
don't hold as definitional equalities, and the semantics of a definition depend on the order 
of the clauses. 



5.2.1 First-match semantics for overlapping patterns 

If the patterns in a definition overlap, then the first-match semantics dictate that the
first matching clause is always used. In the translation to a case tree, this is done by
splitting the clauses and then throwing away some of the results. For example, consider
the following (somewhat artificial) definition of plus : Nat → Nat → Nat:

    plus zero    y       = y
    plus (suc x) zero    = suc x                                   (5.1)
    plus x       (suc y) = suc (plus x y)

To build a case tree, we start in the root node with label x y where x, y : Nat. The first
step is to split this pattern on the variable x. This will specialize the third clause into
two clauses:

    plus zero    (suc y) = suc (plus zero y)
    plus (suc x) (suc y) = suc (plus (suc x) y)

The resulting case tree will look as follows:

    x y                                      (split on x : Nat)
      zero y               ↦ y
      (suc x) y                              (split on y : Nat)
        (suc x) zero       ↦ suc x
        (suc x) (suc y)    ↦ suc (plus (suc x) y)

The first of the new clauses is not necessary to build the case tree, so it is discarded.
Is this case tree still equivalent to the original definition when we discard this clause?
Suppose we want to evaluate plus x (suc y) where x and y are free variables. Then the
case tree contains no pattern matching these arguments, while the original definition
did. So if we want to keep as many reduction rules as possible, then the answer is no:
the case tree is not equivalent. Indeed, even if the patterns form a covering, matching can
still fail if the arguments contain free variables. So it can be useful to keep overlapping
patterns.

As another example, let's look at the following (more standard) definition of plus:

    plus zero    y = y
    plus (suc x) y = suc (plus x y)

Then there are no possible evaluation steps of the term plus x (suc zero), even though
we would like to reduce it to suc x. We can add the following two clauses

    plus x zero    = x
    plus x (suc y) = suc (plus x y)

that allow us to conclude plus x (suc zero) → suc (plus x zero) → suc x. In practice,
it can sometimes be very useful to treat overlapping patterns in this way instead of using
the first-match semantics.
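The effect of the two extra clauses can be reproduced with a small rewriting sketch in Python, in which any clause whose patterns match may fire. This is an illustration under simplifying assumptions, not Agda's evaluator: terms are nested tuples, variables are strings, and the names `DEFINED`, `match`, `apply` and `normalize` are invented for this example.

```python
# A variable is a str; constructor and function applications are tuples.
ZERO = ("zero",)
DEFINED = {"plus"}   # heads that are functions rather than constructors


def suc(t):
    return ("suc", t)


def plus(a, b):
    return ("plus", a, b)


def match(pat, term, subst):
    """True on a match, False on a clash, None when the match is blocked."""
    if isinstance(pat, str):
        subst[pat] = term
        return True
    if isinstance(term, str) or term[0] in DEFINED:
        return None
    if pat[0] != term[0]:
        return False
    results = [match(p, t, subst) for p, t in zip(pat[1:], term[1:])]
    if any(r is False for r in results):
        return False
    return None if any(r is None for r in results) else True


def apply(subst, term):
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply(subst, a) for a in term[1:])


def normalize(term, clauses):
    """Rewrite with any clause that matches until no clause applies."""
    if isinstance(term, str):
        return term
    term = (term[0],) + tuple(normalize(a, clauses) for a in term[1:])
    if term[0] in DEFINED:
        for pats, rhs in clauses:
            subst = {}
            results = [match(p, a, subst) for p, a in zip(pats, term[1:])]
            if all(r is True for r in results):
                return normalize(apply(subst, rhs), clauses)
    return term


STANDARD = [([ZERO, "y"], "y"),
            ([suc("x"), "y"], suc(plus("x", "y")))]
EXTRA = [(["x", ZERO], "x"),
         (["x", suc("y")], suc(plus("x", "y")))]

term = plus("x", suc(ZERO))
print(normalize(term, STANDARD))          # stays stuck on the free variable x
print(normalize(term, STANDARD + EXTRA))  # reduces to suc x
```

With only the standard clauses the term is left unchanged, while the extended clause set reduces it in two steps.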



5.2.2 Definitional equalities are lost 

Since not all complete function definitions have an equivalent case tree, sometimes clauses 
have to be split while building the case tree. This is even the case when the patterns of 
the definition do not overlap. This is illustrated by the following definition of the function 
majority : Bool → Bool → Bool → Bool due to Gérard Berry:

    majority true  true  true  = true
    majority x     false true  = x
    majority true  y     false = y                                 (5.5)
    majority false true  z     = z
    majority false false false = false

The patterns of this definition are complete: for all concrete values of the booleans x, y, z
there is a clause that defines majority x y z. However, they do not form a covering. When
creating a case tree from this definition, the second and third clauses are specialized into
two clauses each. This results in the following case tree:

    x y z                                    (split on x : Bool)
      true y z                               (split on y : Bool)
        true true z                          (split on z : Bool)
          true true true    ↦ true
          true true false   ↦ true
        true false z                         (split on z : Bool)
          true false true   ↦ true
          true false false  ↦ false
      false y z                              (split on y : Bool)
        false true z        ↦ z
        false false z                        (split on z : Bool)
          false false true  ↦ false
          false false false ↦ false

At first sight, this case tree seems to be equivalent to the original definition. However,
its evaluation rules are slightly different. If x : Bool is a free variable, then according to the
original definition majority x false true evaluates to x. According to the case tree, this
is no longer true because the clause has been split into majority true false true = true
and majority false false true = false. So we have lost the definitional equality
majority x false true = x.

Remark that the equality still holds in both cases x = true and x = false, hence it
is possible to construct a term (x : Bool) ⊢ p : majority x false true ≡_Bool x by pattern
matching on x. But this tells us only that the terms are propositionally equal, which is a
weaker kind of equality than the definitional equality.



5.2.3 Order of the patterns matters 

Even when the patterns of a function do not overlap and actually form a covering, the 
order of the clauses can still influence the meaning of that function when it is translated 
to a case tree. For example, consider the following function p : Nat → Nat → Bool which
is a simplified version of an example given by Achim Jung on the Agda mailing list. It
checks whether two given natural numbers are equal modulo 2:

    p zero          zero          = true
    p zero          (suc zero)    = false
    p zero          (suc (suc y)) = p zero y
    p (suc zero)    zero          = false                          (5.6)
    p (suc (suc x)) zero          = p x zero
    p (suc x)       (suc y)       = p x y

Remark that there are again no overlapping patterns in this definition. In fact, the
patterns even form a covering in this case. Here is a case tree that corresponds exactly
to the function p:

    n m                                      (split on n : Nat)
      zero m                                 (split on m : Nat)
        zero zero             ↦ true
        zero (suc m)                         (split on m : Nat)
          zero (suc zero)     ↦ false
          zero (suc (suc m))  ↦ p zero m
      (suc n) m                              (split on m : Nat)
        (suc n) zero                         (split on n : Nat)
          (suc zero) zero     ↦ false
          (suc (suc n)) zero  ↦ p n zero
        (suc n) (suc m)       ↦ p n m

However, when building a case tree from the above definition in the way described in the
previous chapter, the result will be a different case tree. Indeed, the last clause is split
into the following two clauses:

    p (suc zero)    (suc y) = p zero y
    p (suc (suc x)) (suc y) = p (suc x) y

If the last clause had been placed first instead, then the correct covering would have been
reconstructed. One possible solution for this specific problem would be to build all possible
case trees and choose the one that gives the least number of extra splittings, but this could
become very time-consuming for long definitions.

5.3 Allowing overlapping patterns

In order to be able to reduce definitions by pattern matching to eliminators, they are 
converted to case trees. During this conversion, first-match semantics are used. We have 
encountered the following problems caused by this conversion: 

• First-match semantics cause overlapping patterns to be discarded, even in situations 
where it could be useful to keep them. 

• Clauses must be split to create a covering, changing the reduction rules for ar- 
guments with free variables. What clauses are split depends on the order of the 
clauses. 

• Clauses can even be split when the patterns already form a covering, depending on 
the order of the clauses. This results in a different covering than the one we started 
from. 

All these problems come down to the fact that not all clauses are preserved as definitional 
equalities in the translation to a case tree. In Agda, this means that the clauses given by 
the programmer do not correspond to the clauses used internally, which is annoying and 
can lead to unexpected results. 

To fix these problems, we will extend pattern matching in order to allow more general 
pattern sets than just coverings. In particular, the patterns in a definition will be allowed 
to overlap. Instead of following the first-match semantics, these definitions follow 'any- 
match semantics', i.e. any clause can be used to evaluate the function at any time. In 
practice, this means that evaluation of a function application doesn't block when it can't 
be decided whether a pattern matches the arguments or not. Instead, all patterns are 
considered for pattern matching in parallel. If one of the patterns matches the arguments, 
evaluation continues; even if another pattern blocks. This gives us 'what-you-see-is-what- 
you-get' pattern matching where all clauses hold as definitional equalities. 

By extending pattern matching in this way, we solve the three above problems. Overlapping
patterns are allowed, hence there is no need to discard them. We don't need
patterns to form a covering, so there is no need to split clauses. We don't have to translate
definitions to case trees, so the meaning of a definition does not depend on the order of the
clauses. The result is that definitions by pattern matching feel more like mathematical
definitions, rather than program instructions.

Our approach also has some drawbacks: 

• First of all, we lose the ability to translate definitions to pure type theory with 
eliminators. To guarantee correctness (completeness, termination, and confluence), 
we will thus need to reason about the definitions directly. The criterion for termi- 
nation we gave in the previous chapter doesn't depend on the fact that the patterns 
form a covering, but the criteria for completeness and confluence do. In the next 
chapter, we will describe how to check completeness and confluence in the presence 
of overlapping patterns. 

• We also lose the first-match semantics. This doesn't restrict the definitions we
can give, but it requires us to write longer definitions in some cases. This is an
unavoidable problem because we want clauses to be order-independent.

• Finally, we lose the ability to represent functions by case trees, hence the ability to
evaluate functions efficiently. In the next chapter, we will extend case trees with
catchall subtrees, which allows us to represent these more general definitions by case
trees.



Chapter 6 



Checking definitions with 
overlapping patterns 

In the last chapter, we proposed a generalized form of pattern matching with dependent 
types. However, the standard techniques are not sufficient to check correctness of these 
generalized definitions. In particular, the criterion for completeness has to be adapted 
and we also need a criterion for confluence. We will describe both in this chapter. We will 
also present how these generalized definitions can be represented by case trees by adding 
catchall subtrees. 

6.1 Checking completeness 

We want to allow more general pattern sets than just coverings, but at the same time, 
we still want to be sure that the pattern set is complete in the sense of definition 3.11. 
So far, coverings are the only pattern sets we know that are guaranteed to be complete. 
The idea is that we start from a covering and then change it step by step while preserving 
completeness. More concretely, we can make the following changes to a set of patterns 
without losing completeness: 

• We can add new patterns. 

• We can replace a pattern p by a more general pattern p' such that p' ⊒ p, i.e. all
terms that match p also match p'.

• We can contract a number of patterns p_1, …, p_n to a pattern p where p ⊒ p_i for
i = 1, …, n.

Starting from a covering and repeating these three steps, we can build a large number of 
overlapping pattern sets; in particular all pattern sets of the examples from the previous 
chapter can be constructed in this way. 

These three rules can be nicely summarized in one criterion: 

Theorem 6.1. Let P be a set of lists of patterns of the same type. If there exists a
covering O such that for each q ∈ O, there exists a p ∈ P such that p ⊒ q, then P is
complete.

Proof. Since the covering O is complete by theorem 4.8, each closed term t matches a
q ∈ O, i.e. there exists a substitution σ such that t = ⌈q⌉σ. By assumption, there exists
a p ∈ P such that p ⊒ q, i.e. there exists a substitution δ such that ⌈p⌉δ = ⌈q⌉. Then we have
t = ⌈p⌉δσ, hence the set of patterns P is complete. □

This leaves us with the question of how to find this covering O. But in fact, we already 
know how: by building a case tree from the patterns in P. If we succeed in building a case 
tree, then the patterns at the leaves of the case tree form a covering O. We claim that 
this covering satisfies the conditions of the theorem. Indeed, while building the case tree 
we only split patterns in P or we discard overlapping patterns. Hence for each pattern q 
in the resulting covering O, there exists a pattern p in P such that p D q, exactly as we 
wanted. 
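The criterion of theorem 6.1 is easy to sketch in Python once a candidate covering is available. The sketch assumes linear patterns encoded as nested tuples with string variables, takes the covering as given without verifying that it really is a covering, and uses the invented names `generalizes` and `complete_by_covering`.

```python
# A pattern is a str (pattern variable) or a tuple (constructor, subpattern, ...).
T, F = ("true",), ("false",)


def generalizes(p, q):
    """True if q is an instance of p, i.e. p is at least as general as q."""
    if isinstance(p, str):      # a pattern variable matches anything
        return True
    if isinstance(q, str):      # p demands a constructor, q is only a variable
        return False
    return p[0] == q[0] and all(generalizes(a, b) for a, b in zip(p[1:], q[1:]))


def complete_by_covering(patterns, covering):
    """Every element of the covering is an instance of some given pattern list."""
    return all(
        any(all(generalizes(pi, qi) for pi, qi in zip(p, q)) for p in patterns)
        for q in covering
    )


# Patterns of the majority function and the leaves of its case tree.
MAJORITY = [[T, T, T], ["x", F, T], [T, "y", F], [F, T, "z"], [F, F, F]]
LEAVES = [[T, T, T], [T, T, F], [T, F, T], [T, F, F],
          [F, T, "z"], [F, F, T], [F, F, F]]

print(complete_by_covering(MAJORITY, LEAVES))
print(complete_by_covering(MAJORITY[:3] + MAJORITY[4:], LEAVES))
```

Dropping the clause with patterns false true z makes the check fail, because the leaf false true z is then no longer an instance of any remaining pattern list.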

6.2 Checking confluence 

To check the confluence of a definition, we have to check for each pair of clauses that
whenever a term matches both their patterns, then they give the same result for that
term. If there are no terms matching both patterns, then this is trivially satisfied. So
the first thing we need to check is whether two patterns overlap, i.e. whether there is a
term that matches both patterns. We make the following observation: let p_1 and p_2 be
two patterns that have a most general unifier σ, so that ⌈p_1⌉σ = ⌈p⌉ = ⌈p_2⌉σ. Then a
term t matches p if and only if it matches both p_1 and p_2. Also, if there is no unifier of
p_1 and p_2, then there is no term t that matches both p_1 and p_2. So if we require that the
unification of each pair of patterns (with all pattern variables as the flexible variables)
succeeds (either positively or negatively) then we are able to check whether two patterns
overlap.

Now suppose we have found two clauses f p_1 = t_1 and f p_2 = t_2 that overlap, i.e.
⌈p_1⌉σ = ⌈p_2⌉σ for some most general unifier σ. In order to obtain confluence, we require
that the right-hand sides t_1 and t_2 are equal under this substitution σ. But what kind of
equality do we want? Let's look at the following two overlapping clauses of the function
plus:

    plus (suc x_1) y_1       = suc (plus x_1 y_1)
    plus x_2       (suc y_2) = suc (plus x_2 y_2)

We have numbered the variables in order to avoid confusion. The most general unifier of
the patterns is σ = [x_2 ↦ suc x_1, y_1 ↦ suc y_2]. If we apply this substitution to the
right-hand sides, we get suc (plus x_1 (suc y_2)) for the first clause and suc (plus (suc x_1) y_2)
for the second clause. These are certainly not syntactically equal, but they do evaluate to
the same normal form suc (suc (plus x_1 y_2)). Hence they are definitionally equal. This
is the criterion we will use for checking confluence. In section 7.2, we will see that it is
not sufficient to ask that t_1σ and t_2σ are only propositionally equal.

Theorem 6.2. Let C be a set of clauses for the function f : Φ → T. If for each
pair of clauses f p_1 = t_1 and f p_2 = t_2 we have that unification of p_1 and p_2 with all
pattern variables as flexible variables

• either succeeds positively with result σ and t_1σ is definitionally equal to t_2σ,

• or succeeds negatively,

then the clauses C are confluent.

Proof. Let u : Φ and suppose f u →_C v_1 and f u →_C v_2. By definition of the evaluation
rule →_C, there exist clauses f p_1 = t_1, f p_2 = t_2 in C and substitutions δ_1, δ_2 such that
⌈p_1⌉δ_1 = u = ⌈p_2⌉δ_2, t_1δ_1 = v_1 and t_2δ_2 = v_2. In particular we have that δ = δ_1; δ_2
is a unifier of p_1 and p_2, so unification of p_1 and p_2 cannot succeed negatively. By the
assumption of the theorem, unification of p_1 and p_2 must succeed positively with result
σ and there must exist a term t such that t_1σ →*_C t and t_2σ →*_C t. Because σ is
a most general unifier, there exists a substitution δ' such that δ = σ; δ'. This implies
v_1 = t_1δ = (t_1σ)δ' →*_C tδ' and v_2 = t_2δ = (t_2σ)δ' →*_C tδ', hence the set of clauses C is
confluent. □

Now how can this criterion be used to implement a confluence checker? Suppose we
have a function f : Φ → T and two clauses f p_1 = t_1 and f p_2 = t_2 where ΓΔ_1 ⊢
p_1 : Φ pattern and ΓΔ_2 ⊢ p_2 : Φ pattern. To check confluence of these clauses, the
algorithm proceeds as follows:

First, we attempt to unify p_1 and p_2. If the unification succeeds negatively, then the
clauses are confluent. If the unification succeeds positively with result σ, then we apply
σ to t_1 and t_2. We proceed by evaluating t_1σ and t_2σ to normal form. If the results are
syntactically equal, then the clauses are confluent. The algorithm is summarized in figure
6.1.

\[
\frac{\textsc{Unify}(\lceil p_1 \rceil, \lceil p_2 \rceil) \Rightarrow \sigma
      \qquad
      \Gamma\Delta_1\Delta_2 \vdash t_1\sigma = t_2\sigma : T[\Phi \mapsto \lceil p_1 \rceil]\sigma}
     {\textsc{Confluent}(f\ p_1 = t_1,\ f\ p_2 = t_2)}
\]

\[
\frac{\textsc{Unify}(\lceil p_1 \rceil, \lceil p_2 \rceil) \text{ succeeds negatively}}
     {\textsc{Confluent}(f\ p_1 = t_1,\ f\ p_2 = t_2)}
\]

Figure 6.1: The algorithm for checking confluence. Since σ is a unifier of ⌈p_1⌉ and ⌈p_2⌉,
we have T[Φ ↦ ⌈p_1⌉]σ = T[Φ ↦ ⌈p_2⌉]σ, i.e. the terms we compare have equal type.

Remark that in order to check confluence, we already need to evaluate the function 
for which we are checking confluence. Hence we can only check confluence after we have 
first checked termination of the definition. 
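The confluence check can be sketched in Python as follows. This is a simplified illustration, not the implementation discussed in this thesis: it assumes linear constructor patterns without inaccessible patterns, clauses whose variables have been renamed apart, and a terminating definition. All names (`unify`, `resolve`, `normalize`, `confluent`, `all_confluent`) are invented for this sketch.

```python
# A variable is a str; constructor and function applications are tuples.
ZERO = ("zero",)
DEFINED = {"plus"}   # heads that are functions rather than constructors

def suc(t):
    return ("suc", t)

def plus(a, b):
    return ("plus", a, b)

def match(pat, term, subst):
    """True on a match, False on a clash, None when the match is blocked."""
    if isinstance(pat, str):
        subst[pat] = term
        return True
    if isinstance(term, str) or term[0] in DEFINED:
        return None
    if pat[0] != term[0]:
        return False
    results = [match(p, t, subst) for p, t in zip(pat[1:], term[1:])]
    if any(r is False for r in results):
        return False
    return None if any(r is None for r in results) else True

def apply(subst, term):
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply(subst, a) for a in term[1:])

def normalize(term, clauses):
    """Rewrite with any matching clause until no clause applies."""
    if isinstance(term, str):
        return term
    term = (term[0],) + tuple(normalize(a, clauses) for a in term[1:])
    if term[0] in DEFINED:
        for pats, rhs in clauses:
            subst = {}
            if all(match(p, a, subst) is True for p, a in zip(pats, term[1:])):
                return normalize(apply(subst, rhs), clauses)
    return term

def walk(t, s):
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Unify two constructor patterns; None signals a constructor clash."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return {**s, a: b}
    if isinstance(b, str):
        return {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s

def resolve(t, s):
    """Apply a unifier computed by unify to a term."""
    t = walk(t, s)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(resolve(a, s) for a in t[1:])

def confluent(c1, c2, clauses):
    (p1, t1), (p2, t2) = c1, c2
    s = {}
    for a, b in zip(p1, p2):
        s = unify(a, b, s)
        if s is None:
            return True   # negative success: the patterns never overlap
    return normalize(resolve(t1, s), clauses) == normalize(resolve(t2, s), clauses)

def all_confluent(clauses):
    return all(confluent(a, b, clauses) for a in clauses for b in clauses)

PLUS = [([ZERO, "y0"], "y0"),
        ([suc("x1"), "y1"], suc(plus("x1", "y1"))),
        (["x3", ZERO], "x3"),
        (["x2", suc("y2")], suc(plus("x2", "y2")))]
BAD = PLUS[:2] + [(["x3", ZERO], ZERO)] + PLUS[3:]   # plus x zero = zero

print(all_confluent(PLUS), all_confluent(BAD))
```

The altered clause set `BAD` is rejected because its second and third clauses overlap on plus (suc x1) zero but produce different normal forms.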

There are two ways in which the confluence check can fail: 

• The unification of the patterns can fail. Remark that unification of patterns con- 
sisting of only constructors and variables always succeeds (either positively or neg- 
atively). So the unification can only fail if we have to unify an inaccessible pattern 
with a constructor pattern or another inaccessible pattern. 

• The normal forms of t_1σ and t_2σ may not be syntactically equal. Either this is
because the clauses are not confluent, or they are confluent but a simple syntactic
check of the normal forms is insufficient to see this.

In our experience, the first failure occurs more often than the second one. So in order
to improve the algorithm, it would probably be more fruitful to improve the unification
algorithm rather than the comparison of the right-hand sides.

6.3 Case trees for overlapping clauses

By allowing patterns to overlap, we have lost the ability to represent function definitions
as case trees. In this section, we will discuss how functions whose patterns do not form
a covering can be represented in a similar way. This approach has been inspired by the
paper by A. Gräf on left-to-right tree pattern matching [Gr91].

To capture the extra reduction rules given by overlapping patterns, we will add a 
catchall subtree to each internal node of a case tree. This fallback subtree captures 
exactly the information from the user patterns that is lost in the translation to a case 
tree. During evaluation, a catchall subtree will give an alternative when the matching 
with the main case tree is inconclusive. 

A catchall subtree can be recognized by the fact that it is always labeled with the 
same patterns as its parent node. For example, here is a case tree with a catchall subtree 
for the overlapping definition of plus:

    n m                                      (split on n : Nat)
      zero m          ↦ m
      (suc n) m       ↦ suc (plus n m)
      catchall: n m                          (split on m : Nat)
        n (suc m)     ↦ suc (plus n m)

Catchall subtrees are not required to be complete, i.e. the patterns of the children of a
node are no longer required to form a covering. For example, here is a case tree with
catchall subtrees that captures the exact semantics of the majority function (5.5):

    x y z                                    (split on x : Bool)
      true y z                               (split on y : Bool)
        true true z                          (split on z : Bool)
          true true true    ↦ true
          true true false   ↦ true
        true false z                         (split on z : Bool)
          true false true   ↦ true
          true false false  ↦ false
        catchall: true y z                   (split on z : Bool)
          true y false      ↦ y
      false y z                              (split on y : Bool)
        false true z        ↦ z
        false false z                        (split on z : Bool)
          false false true  ↦ false
          false false false ↦ false
      catchall: x y z                        (split on y : Bool)
        x false z                            (split on z : Bool)
          x false true      ↦ x

Remark also that while extra clauses have been added in this case tree, all original clauses
are still present. The added clauses are merely specialized versions of the original clauses
that are needed to get a complete case tree when we remove the catchall subtrees.

Definition 6.3 (Catchall tree). A catchall tree for a function f : Δ → T with label
p : Δ pattern is one of the following:

• Either it is an internal node and there is a pattern variable x of p such that the
subtrees are again catchall trees with labels p_i, where each p_i is obtained from p by
instantiating x with a constructor c_i,

• or it is a leaf node containing a term t : T[x ↦ ⌈p⌉] called the right-hand side,

• or p contains an absurd pattern ∅; in this case the tree is a leaf node with no
right-hand side.


A case tree with catchall subtrees is the same as a regular case tree, except that each 
internal node with label p is allowed to have an additional subtree that is a catchall tree 
with the same label p. Each internal node of a catchall subtree is also allowed to have an 
additional catchall subtree, and so on. 
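Evaluation with catchall subtrees can be sketched in Python as follows. The fallback policy used here (try the regular subtrees first, and consult the catchall subtree of a node whenever they give no result) is one possible reading of the description above and should be taken as an assumption, as should the tuple encoding of trees and all function names.

```python
# A variable is a str; constructor and function applications are tuples.
# Trees: ("leaf", label, rhs) or ("node", label, children, catchall_or_None).
ZERO = ("zero",)

def suc(t):
    return ("suc", t)

def match(pat, term, subst):
    """True on a match, False on a clash, None when a free variable blocks."""
    if isinstance(pat, str):
        subst[pat] = term
        return True
    if isinstance(term, str):
        return None
    if pat[0] != term[0]:
        return False
    results = [match(p, t, subst) for p, t in zip(pat[1:], term[1:])]
    if any(r is False for r in results):
        return False
    return None if any(r is None for r in results) else True

def match_all(pats, args):
    subst = {}
    results = [match(p, a, subst) for p, a in zip(pats, args)]
    if any(r is False for r in results):
        return False, subst
    return (None if any(r is None for r in results) else True), subst

def apply(subst, term):
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply(subst, a) for a in term[1:])

def evaluate(tree, args):
    """One evaluation step; None means that evaluation is stuck."""
    if tree[0] == "leaf":
        _, label, rhs = tree
        status, subst = match_all(label, args)
        return apply(subst, rhs) if status is True else None
    _, label, children, catchall = tree
    for child in children:
        status, _ = match_all(child[1], args)
        if status is True:
            result = evaluate(child, args)
            if result is not None:
                return result
    # The regular subtrees were inconclusive: fall back to the catchall subtree.
    return evaluate(catchall, args) if catchall is not None else None

PLUS_TREE = ("node", ["n", "m"], [
    ("leaf", [ZERO, "m"], "m"),
    ("leaf", [suc("n"), "m"], suc(("plus", "n", "m"))),
], ("node", ["n", "m"], [
    ("leaf", ["n", suc("m")], suc(("plus", "n", "m"))),
], None))

print(evaluate(PLUS_TREE, ["x", suc(ZERO)]))  # handled by the catchall subtree
print(evaluate(PLUS_TREE, ["x", "y"]))        # still stuck
```

The first call cannot be decided by the regular subtrees because x is a free variable, so the step is taken by the clause stored in the catchall subtree.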

Case trees with catchall subtrees can be built from a (complete) set of clauses in
almost the same way as regular case trees. When building the subtrees, we will take care
not to throw away any information about the reduction rules contained in the clauses.
Remember that while building a case tree, the set of clauses was partitioned in sets C_j
for each constructor c_j, and two additional sets C_var and C_rest. The clauses that are lost
as definitional equalities in the translation to a case tree are exactly the ones in C_var and
C_rest. Hence these clauses will be used to build the catchall subtree. Remark that the
variable x we chose to build the covering is no longer a blocking variable for the clauses
C_var ∪ C_rest, so a different variable will have to be chosen in the catchall subtree.

Let us give an example of how a case tree with catchall subtrees is built. Define a
data type x ≟_A y with one parameter A : Set, two indices x, y : A and two constructors:

    eq : (x : A) → x ≟_A x
    neq : (x : A)(y : A) → x ≟_A y        (6.1)

Here is the definition of a function f : (y : Bool)(p : true ≟_Bool y) → Bool:

    f true p = true
    f y (neq ⌊true⌋ ⌊y⌋) = y        (6.2)

The first step in building a case tree is to split on the variable y : Bool. This partitions 
the clauses into the following sets: 

    C_true = {f true p = true}
    C_false = ∅
    C_var = {f y (neq ⌊true⌋ ⌊y⌋) = y}
    C_rest = ∅

The case tree is built as normal, the only interesting part is that the clause in C var is used 
to build a catchall subtree. Here is the final result: 



    y p
    └─ split on y : Bool
       ├─ true p ↦ true
       ├─ false p
       │  └─ split on p : true ≟_Bool false
       │     └─ false (neq ⌊true⌋ ⌊false⌋) ↦ false
       └─ y p    (catchall subtree)
          └─ split on p : true ≟_Bool y
             └─ y (neq ⌊true⌋ ⌊y⌋) ↦ y



Chapter 7 
Evaluation 



In this chapter, we will give some examples of definitions with overlapping patterns. 
By defining functions with overlapping patterns, we get extra evaluation rules, which 
sometimes makes it easier to prove propositions that mention these functions. We will 
also give an example where our confluence check fails unexpectedly. We will discuss an 
alternative criterion for confluence that does accept this definition. However, we will argue 
that this alternative criterion is ultimately not enough to obtain the kind of confluence 
we need, hence we will stick to our original criterion. 



7.1 Some examples 

7.1.1 Addition of natural numbers 

We have mentioned it often enough, but here is the full definition of plus : Nat → Nat →
Nat with overlapping patterns one more time:

    plus zero y = y
    plus (suc x) y = suc (plus x y)
    plus x zero = x
    plus x (suc y) = suc (plus x y)        (7.1)

Remark that the reduction rules of this definition are maximal in some sense: whenever
a constructor appears in any of the arguments, we can always immediately
apply one of the clauses. We could add more clauses to the definition, for example
plus (suc m) (suc n) = suc (suc (plus m n)), but this doesn't add any more power to
the definition. Indeed, we can get the same evaluation rule by combining the second and
the fourth clause.

In section 6.2, we have shown how to check confluence of the second and the fourth
clause. The other pairs can be checked in a similar way.

Using this definition of plus can make it a lot easier to define new functions where
natural numbers appear in the types. For example, we can readily define the function
plus-comm : (m : Nat)(n : Nat) → (plus m n) =_Nat (plus n m) that proves the
commutativity of plus as follows:

    plus-comm zero n = refl n
    plus-comm (suc m) n = cong suc (plus-comm m n)        (7.2)
Here, cong is a congruence rule that has type (f : A → B)(x : A)(y : A) → x =_A y →
f x =_B f y. It can be defined by pattern matching as

    cong f ⌊x⌋ ⌊x⌋ (refl x) = refl (f x)

To give this proof for the non-overlapping definition of plus, the Agda standard library
first needs a lemma to prove that 1 + (m + n) = m + (1 + n), and then the proof itself
still takes approximately eight lines.
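
The effect of the extra clauses on terms with free variables can also be observed outside of Agda. The following is a minimal Python sketch, not the implementation described in this thesis: it treats the four clauses of plus as rewrite rules on first-order terms and normalizes innermost-first. The term encoding and the helper names match, subst and normalize are assumptions made for this illustration only.

```python
# Terms: a variable is a str; an application is a tuple (head, arg1, ..., argN).
ZERO = ("zero",)

def suc(t):
    return ("suc", t)

def plus(a, b):
    return ("plus", a, b)

# The four overlapping clauses of plus, as (left-hand side, right-hand side).
RULES = [
    (plus(ZERO, "y"), "y"),
    (plus(suc("x"), "y"), suc(plus("x", "y"))),
    (plus("x", ZERO), "x"),
    (plus("x", suc("y")), suc(plus("x", "y"))),
]

def match(pattern, term, env):
    """Match a linear pattern against a term; return env, or None on failure."""
    if isinstance(pattern, str):  # a pattern variable matches anything
        env[pattern] = term
        return env
    if isinstance(term, str) or term[0] != pattern[0] or len(term) != len(pattern):
        return None  # a free variable or a different head blocks the match
    for p, t in zip(pattern[1:], term[1:]):
        if match(p, t, env) is None:
            return None
    return env

def subst(term, env):
    """Simultaneously replace the pattern variables of term according to env."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0],) + tuple(subst(a, env) for a in term[1:])

def normalize(term, rules):
    """Normalize the arguments first, then apply the first clause that matches."""
    if isinstance(term, str):
        return term
    term = (term[0],) + tuple(normalize(a, rules) for a in term[1:])
    for lhs, rhs in rules:
        env = match(lhs, term, {})
        if env is not None:
            return normalize(subst(rhs, env), rules)
    return term

# With only the first two clauses, plus m zero is stuck on the free variable m.
print(normalize(plus("m", ZERO), RULES[:2]))
# With all four clauses it reduces to m.
print(normalize(plus("m", ZERO), RULES))
```

In this sketch the clause plus (suc m) (suc n) = suc (suc (plus m n)) is indeed derivable: normalizing plus (suc m) (suc n) applies the second clause and then the fourth.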

7.1.2 Operations on booleans 

Analogously to the definition of plus, we can define operations and and or with overlap- 
ping patterns: 



    and true y = y
    and false y = false
    and x true = x
    and x false = false        (7.3)

    or true y = true
    or false y = y
    or x true = true
    or x false = x        (7.4)



Confluence can be checked very easily for these two definitions. 
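
To make the check concrete, here is a small Python sketch of the pairwise test for these two definitions. It is an illustration under simplifying assumptions, not the confluence checker of this thesis: patterns are flat (each argument is either a constructor of Bool or a variable) and every right-hand side is a single constructor or variable, so no normalization is needed after unification. The names rename, resolve, unify and confluent are hypothetical.

```python
from itertools import combinations

CONSTRUCTORS = {"true", "false"}

# Each clause is (argument patterns, right-hand side); any name that is not a
# constructor is a pattern variable.
AND = [
    (("true", "y"), "y"),
    (("false", "y"), "false"),
    (("x", "true"), "x"),
    (("x", "false"), "false"),
]
OR = [
    (("true", "y"), "true"),
    (("false", "y"), "y"),
    (("x", "true"), "true"),
    (("x", "false"), "x"),
]

def rename(clause, tag):
    """Suffix the variables of a clause so that different clauses never share names."""
    def fresh(s):
        return s if s in CONSTRUCTORS else f"{s}_{tag}"
    pats, rhs = clause
    return tuple(fresh(p) for p in pats), fresh(rhs)

def resolve(s, sub):
    """Follow the substitution until a constructor or an unbound variable is reached."""
    while s in sub:
        s = sub[s]
    return s

def unify(pats1, pats2):
    """Unify two pattern lists; return a substitution, or None if they cannot overlap."""
    sub = {}
    for a, b in zip(pats1, pats2):
        a, b = resolve(a, sub), resolve(b, sub)
        if a == b:
            continue
        if a not in CONSTRUCTORS:
            sub[a] = b
        elif b not in CONSTRUCTORS:
            sub[b] = a
        else:
            return None  # two different constructors: the clauses do not overlap
    return sub

def confluent(clauses):
    """Every overlapping pair must have equal right-hand sides under the unifier."""
    for (i, c1), (j, c2) in combinations(enumerate(clauses), 2):
        (p1, r1), (p2, r2) = rename(c1, i), rename(c2, j)
        sub = unify(p1, p2)
        if sub is not None and resolve(r1, sub) != resolve(r2, sub):
            return False
    return True

print(confluent(AND), confluent(OR))
```

Changing the last clause of and to and x false = true makes the check fail, because that clause then disagrees with and true y = y on the overlap and true false.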

7.1.3 Concatenation of vectors 

We can also give a definition of the concatenation concat of two vectors that has type
concat : (m : Nat)(n : Nat) → Vec A m → Vec A n → Vec A (plus m n):

    concat ⌊zero⌋ n ε w = w
    concat m ⌊zero⌋ v ε = v        (7.5)
    concat ⌊suc m⌋ n (cons m a v) w = cons m a (concat m n v w)

The non-standard clause in this definition is the second one, it overlaps with both the 
first clause and the third clause. In both cases, it can be checked that the clauses satisfy 
the criterion for confluence. It is also interesting to note that for the first clause to be 
of correct type, we need that plus zero n = n; while for the second clause we need that 
plus m zero = m. So this definition of concat relies in an essential way upon the fact 
that the definition of plus has overlapping clauses. 

7.1.4 Transitivity of the propositional equality 

Remember that the definition of the propositional equality =_A as an inductive family
only provides reflexivity of the relation. In order to prove that =_A is symmetric and
transitive, we have to give a proof by pattern matching. For example, here is a definition
of trans : (x : A)(y : A)(z : A) → x =_A y → y =_A z → x =_A z:

    trans ⌊y⌋ ⌊y⌋ z (refl y) q = q
    trans x ⌊y⌋ ⌊y⌋ p (refl y) = p        (7.6)

Remark that the patterns overlap; their unification is given by [x ↦ y, z ↦ y, p ↦
refl y, q ↦ refl y]. Under this substitution, the right-hand sides are equal, hence the
clauses are confluent. This shows that the confluence checker also works correctly in the
presence of inaccessible patterns.

Under the interpretation of proofs of x =_A y as paths from x to y in homotopy type
theory, these two clauses tell us that if we compose a path p with the identity path on
either side, the result will be homotopic to p.

7.1.5 A counterexample: multiplication of natural numbers 

Let us look at another function on natural numbers, multiplication: 

    mult zero y = zero
    mult (suc x) y = plus (mult x y) y
    mult x zero = zero
    mult x (suc y) = plus x (mult x y)

Let us focus on the confluence of the second and the fourth clause. If we rename
the variables as mult (suc x₁) y₁ = plus (mult x₁ y₁) y₁ and
mult x₂ (suc y₂) = plus x₂ (mult x₂ y₂), then unification of the patterns gives
σ = [x₂ ↦ suc x₁, y₁ ↦ suc y₂]. Under this substitution, the right-hand sides become

    plus (mult x₁ (suc y₂)) (suc y₂) ⟶ suc (plus (plus x₁ (mult x₁ y₂)) y₂)

and

    plus (suc x₁) (mult (suc x₁) y₂) ⟶ suc (plus x₁ (plus (mult x₁ y₂) y₂))

We see that the right-hand sides do not have the same normal form when we apply the 
substitution, so they are not definitionally equal. Hence this definition of mult does not 
satisfy the criterion for confluence. 
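
This failed check can be replayed mechanically. The Python sketch below is an illustration only, not the confluence checker of this thesis: it encodes the clauses of plus and mult as rewrite rules on first-order terms, instantiates the right-hand sides of the second and fourth clause of mult with the unifier [x₂ ↦ suc x₁, y₁ ↦ suc y₂], and compares their normal forms syntactically. The term encoding and the helper names are assumptions made for this example.

```python
# Terms: a variable is a str; an application is a tuple (head, arg1, ..., argN).
ZERO = ("zero",)

def suc(t):
    return ("suc", t)

def plus(a, b):
    return ("plus", a, b)

def mult(a, b):
    return ("mult", a, b)

# All clauses of plus and mult, as (left-hand side, right-hand side).
RULES = [
    (plus(ZERO, "y"), "y"),
    (plus(suc("x"), "y"), suc(plus("x", "y"))),
    (plus("x", ZERO), "x"),
    (plus("x", suc("y")), suc(plus("x", "y"))),
    (mult(ZERO, "y"), ZERO),
    (mult(suc("x"), "y"), plus(mult("x", "y"), "y")),
    (mult("x", ZERO), ZERO),
    (mult("x", suc("y")), plus("x", mult("x", "y"))),
]

def match(pattern, term, env):
    """Match a linear pattern against a term; return env, or None on failure."""
    if isinstance(pattern, str):
        env[pattern] = term
        return env
    if isinstance(term, str) or term[0] != pattern[0] or len(term) != len(pattern):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        if match(p, t, env) is None:
            return None
    return env

def subst(term, env):
    """Simultaneously replace the pattern variables of term according to env."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0],) + tuple(subst(a, env) for a in term[1:])

def normalize(term, rules):
    """Normalize the arguments first, then apply the first clause that matches."""
    if isinstance(term, str):
        return term
    term = (term[0],) + tuple(normalize(a, rules) for a in term[1:])
    for lhs, rhs in rules:
        env = match(lhs, term, {})
        if env is not None:
            return normalize(subst(rhs, env), rules)
    return term

# Right-hand sides of clauses 2 and 4 under [x2 -> suc x1, y1 -> suc y2].
rhs2 = plus(mult("x1", suc("y2")), suc("y2"))
rhs4 = plus(suc("x1"), mult(suc("x1"), "y2"))

nf2 = normalize(rhs2, RULES)
nf4 = normalize(rhs4, RULES)
print(nf2)
print(nf4)
print(nf2 == nf4)  # the two normal forms differ, so the check rejects mult
```

Both results are stuck on the free variables x1 and y2, and they differ exactly in how the nested applications of plus are bracketed.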

By first proving that plus is associative, it is possible to give a term 

    (x : Nat)(y : Nat) ⊢
    p : suc (plus (plus x (mult x y)) y) =_Nat suc (plus x (plus (mult x y) y))        (7.7)

that proves that the right-hand sides are propositionally equal. In the next section, we
will consider whether it is sufficient to have such a propositional equality instead of a
definitional equality.



7.2 Explicit proofs of confluence 

Instead of trying to check confluence automatically, we could ask the user for an explicit
proof of confluence. We could represent a proof of confluence of two clauses f p₁ = t₁
and f p₂ = t₂ as a term of type Γ → ⌈p₁⌉ = ⌈p₂⌉ → t₁ = t₂, where Γ is a context
containing the pattern variables of p₁ and p₂. For example, to prove the confluence of the
clauses plus zero y = y and plus x zero = x, we would have to give a term p of type
(x : Nat)(y : Nat) → zero =_Nat x → y =_Nat zero → x =_Nat y, which is not too hard:

    p ⌊zero⌋ ⌊zero⌋ (refl ⌊zero⌋) (refl ⌊zero⌋) = refl zero
In general, when the criterion for confluence is satisfied, it is never hard to give a similar
proof of confluence. But is it enough to have such a proof?

Let's take another look at the multiplication of natural numbers. We remarked before
that it is possible to give an explicit proof that this definition is confluent. But what
happens if we try to evaluate mult (suc x) (suc y) where x and y are free variables? If we
apply the second clause first, we get

    mult (suc x) (suc y)
      ⟶ plus (mult x (suc y)) (suc y)
      ⟶ plus (plus x (mult x y)) (suc y)
      ⟶ suc (plus (plus x (mult x y)) y)

If we apply the fourth clause first instead, we get

    mult (suc x) (suc y)
      ⟶ plus (suc x) (mult (suc x) y)
      ⟶ plus (suc x) (plus (mult x y) y)
      ⟶ suc (plus x (plus (mult x y) y))

which is equal up to associativity of plus. But associativity is not a definitional equality,
hence these are two different normal forms of the same term, violating the Church-Rosser
property. It is true that both clauses will give the same result for closed arguments, but
we also need confluence for arguments with free variables. Indeed, the definition of the
Church-Rosser property is not limited to just closed terms. Hence the criterion we gave
that uses definitional equality is the correct one, because it also guarantees confluence for 
open arguments. 



Chapter 8 



Link with non-overlapping definitions

We have shown that overlapping function definitions can be very useful, but we also 
have to worry whether they are not too powerful: we want our addition to be consistent 
with the core type theory of UTT. Specifically, we do not want to introduce new closed 
terms of types that are supposed to be empty (such as ⊥). For definitions by pattern
matching whose patterns form a covering, this is done by translating the definition to 
repeated application of eliminators [GMM06]. If the patterns of a definition do not form 
a covering, there is no hope to proceed in this way. 

It is not in scope of this thesis to give a full consistency proof, as consistency is usually 
the hardest property to prove about any version of type theory. However, we will prove 
that each new function definition we introduce is equivalent to an old one. In order to 
formulate the proposition, we first have to define what we mean by 'equivalent'. It is 
not realistic to ask that they are definitionally or propositionally equal, because both are 
intensional equalities: they care about how functions are defined, not just about their 
values. Hence we will define an equivalence ≈ which is extensional:

Definition 8.1 (Weak equivalence). We define an equivalence relation ≈ on
terms called weak equivalence as the smallest equivalence relation satisfying the
following rules:

• If Γ ⊢ t₁ = t₂ : T, then t₁ ≈ t₂.

• If f₁ a ≈ f₂ a for all closed terms a, then f₁ ≈ f₂.

• If f₁ ≈ f₂ and t₁ ≈ t₂, then f₁ t₁ ≈ f₂ t₂.

Remark that the functions f₁ and f₂ in the third rule can be dependent functions, hence
the types of f₁ t₁ and f₂ t₂ can be different even if f₁ and f₂ have the same type. Hence
≈ is a heterogeneous equivalence relation: it is possible that t₁ ≈ t₂ when t₁ and t₂ do
not have the same type.

We can now formulate the theorem that gives the link between definitions with and 
without overlapping clauses: 
Theorem 8.2. If a function f : (x : A) → T is defined by a set of overlapping clauses
that satisfy the criteria for completeness (6.1), termination (4.11), and confluence
(6.2), then there exists a function f' : (x : A) → T whose patterns form a covering
and such that f ≈ f'.



In order to prove the theorem, we first need lemma 8.3. 



Lemma 8.3. Let t₁, t₂ be two terms with free variables in the context Γ. We have
that t₁ ≈ t₂ if and only if t₁[Γ ↦ a] ≈ t₂[Γ ↦ a] for all closed terms a : Γ.



Proof of the lemma. First, we prove that t₁ ≈ t₂ whenever t₁[Γ ↦ a] ≈ t₂[Γ ↦ a] for
all closed terms a : Γ by induction on the length of the context Γ. If Γ = ε, then the
lemma is trivial. So suppose Γ = (x : T)Γ' and let a : T be a closed term. Then
t₁[x ↦ a][Γ' ↦ a'] ≈ t₂[x ↦ a][Γ' ↦ a'] for all closed terms a' : Γ' by assumption. By the
induction hypothesis, we have t₁[x ↦ a] ≈ t₂[x ↦ a]. Now define f₁ = λ(x : T). t₁ and
f₂ = λ(x : T). t₂. Then f₁ a ≈ t₁[x ↦ a] ≈ t₂[x ↦ a] ≈ f₂ a for all closed terms a : T,
so f₁ ≈ f₂ by the second rule. By the third rule, we have t₁ ≈ f₁ x ≈ f₂ x ≈ t₂, as we
wanted to prove.

Now suppose t₁ ≈ t₂ and let σ be any substitution, then we will prove that t₁σ ≈ t₂σ
by induction on the derivation of t₁ ≈ t₂:

• If Γ ⊢ t₁ = t₂ : T, then we have Γσ ⊢ t₁σ = t₂σ : Tσ, hence t₁σ ≈ t₂σ.

• If t₁ a ≈ t₂ a for all closed terms a, then we have (t₁ a)σ ≈ (t₂ a)σ by the induction
hypothesis. Because a is closed, this implies (t₁σ) a ≈ (t₂σ) a. This holds for any
closed a, hence t₁σ ≈ t₂σ.

• If t₁ = f₁ s₁ and t₂ = f₂ s₂ for f₁ ≈ f₂ and s₁ ≈ s₂, then we have f₁σ ≈ f₂σ
and s₁σ ≈ s₂σ by the induction hypothesis. Hence we have t₁σ = (f₁σ) (s₁σ) ≈
(f₂σ) (s₂σ) = t₂σ.

If we take σ = [Γ ↦ a] in particular, then we get t₁[Γ ↦ a] ≈ t₂[Γ ↦ a], as we wanted to
prove. □

Proof of theorem 8.2. Let P be the set of patterns in the definition of f. Because the
clauses of f satisfy the requirements of 6.1, there exists a covering O such that for each
p ∈ O, there exists a q ∈ P such that q ⊒ p. In other words, for all p ∈ O there exists a
clause f q = t for f and a substitution σ such that qσ = p. The function f' will be defined
by the clauses f' p = t[f ↦ f']σ for all p ∈ O. We check that this is a valid definition:

• The set of patterns O is a covering, hence the patterns are valid and complete.

• The arguments of all recursive calls f s in the right-hand side of a clause f q = t
satisfy s ≺ ⌈q⌉. This implies sσ ≺ ⌈q⌉σ = ⌈p⌉ by lemma 4.12. This implies that the
definition of f' is terminating by theorem 4.11.

• The patterns in O do not overlap, hence the definition of f' is confluent.

Now we will prove that f ≈ f'. By definition of ≈, it is sufficient to prove that
f a ≈ f' a for all closed a : A. We will proceed by well-founded induction on the
structural order ≺. By completeness of f', there exists a clause f' p = s such that
MATCH(p, a) ⇒ τ. This gives us that a = ⌈p⌉τ and f' a ⟶ sτ. By construction of f',
the clause f' p = s is of the form f' qσ = t[f ↦ f']σ where f q = t is a clause of f. This
implies that a = ⌈p⌉τ = ⌈q⌉στ and hence f a ⟶ tστ = s[f' ↦ f]τ.

We claim that f u ≈ f' u for each recursive call f' u in sτ. By lemma 8.3, it is
sufficient to prove that f (u[Γ ↦ b]) ≈ f' (u[Γ ↦ b]) where Γ contains all free variables
of u and b : Γ is closed. By the criterion for termination and lemma 4.12, we have that
u ≺ a, hence u[Γ ↦ b] ≺ a[Γ ↦ b] = a by lemma 4.12 and the fact that a is closed. So by
the induction hypothesis, we have f (u[Γ ↦ b]) ≈ f' (u[Γ ↦ b]).

We now have f u ≈ f' u for each recursive call in sτ. By applying the definition of ≈
and lemma 8.3 repeatedly, we get that sτ ≈ s[f' ↦ f]τ. So we have that f a ≈ s[f' ↦
f]τ ≈ sτ ≈ f' a for all closed a, which implies f ≈ f', as we wanted to prove. □



Chapter 9 

Conclusion and future work 



The main goal of this thesis is to make dependent pattern matching more intuitively us- 
able for specialists and non-specialists alike. We try to do this by fixing some discrepancies 
between the definitions given by the user and the internal representation of these defini- 
tions. To do this, we extend the semantics of pattern matching to sets of patterns that do 
not necessarily form a covering. In particular, we allow overlapping patterns. This allows 
us to give a lot of natural definitions that behave as we would expect them to. 

In practice, a typical user would probably start by giving a non-overlapping definition
and add overlapping clauses when the need for them arises. For example, when giving
the clause concat v ε = v for the concatenation operator on vectors, the type checker
complains that the length plus n zero of the left-hand side does not equal the length n of
the right-hand side. The user can then add the clause plus n zero = n to the definition
of plus, after which the type checker doesn't complain any more. This blends well with
the typical interactive development of dependently typed programs in Agda.

The current implementation is still very experimental. It would be interesting to give 
a full implementation that is compatible with extensions of pattern matching in Agda 
such as wildcard patterns, 'with'-expressions, and coinductive data types. It should also 
be possible to implement the pattern matching described in this thesis in other languages 
with dependent pattern matching, for example Coq, Epigram, or ATS. 

There are some limits to our approach: the confluence checker doesn't always see that
a definition is confluent. This occurs when inaccessible patterns overlap with constructor
patterns or other inaccessible patterns. This could be solved by improving the unification
algorithm for patterns. Another case where the confluence check fails is the multiplication
on natural numbers. This problem is not easily solved by improving the confluence
checker, however. Rather, it depends crucially on the question whether we want to see
l + (m + n) and (l + m) + n as 'the same' even if l, m and n are free variables. The current
definition of type theory doesn't allow this kind of non-computational (undirected)
definitional equality. To allow such definitions, we would need to introduce the theory
of AC (associative-commutative) rewriting [BP85] to type theory.

As with any modification to type theory, there is the question of soundness. Specifically,
does our extended form of pattern matching allow us to give a closed term of type
⊥? One would certainly hope that the answer is no. We think that our theorem 8.2 gives
a step in the right direction, but it is too weak because it doesn't say much about the
reduction rules involved. It is an interesting question what extra requirements we need
to obtain a stronger equivalence with non-overlapping definitions.
Appendix A
Implementation 



In chapter 6, we discussed the conceptual issues with the implementation of definitions
with overlapping patterns, in particular the implementation of the confluence checker. We
also gave some examples of definitions with overlapping patterns in chapter 7. In this
appendix, we want to give a few additional remarks that are specific to the implementation of
these features in Agda. First, we describe how definitions with overlapping patterns can be
given; and second, we describe some issues with the implementation itself. As of this
writing, a working version of Agda with our modifications has not yet been released. If you are
interested in a copy, please contact the author at: jesper.cockx@student.kuleuven.be.



A.1 Usage

In order to maintain backwards compatibility with functions that use the first-match
semantics, we added a new keyword overlapping to Agda that marks all clauses in the
next definition to be interpreted as definitional equalities. Function definitions without
this keyword are still translated to a case tree internally, according to the first-match
semantics. Figure A.1 gives an example of how this keyword is used.

An overlapping keyword can be followed by a block containing multiple functions.
The end of the block is determined by the indentation. Figure A.2 gives an example of
such an overlapping block. This works in the same way as other keywords in Agda. If
used in this way, an overlapping block doubles as a mutual block in Agda: all functions
in the block can refer to each other, irrespective of their order. Confluence checking is
also done for all definitions in the block at once. This could be useful if the confluence of
two definitions each depends on the other definition.

In order to allow definitions such as plus in figure A.1, the built-in check for unreachable
clauses has been disabled.



    overlapping
      _+_ : ℕ → ℕ → ℕ
      zero + n = n
      (suc m) + n = suc (m + n)
      m + zero = m
      m + (suc n) = suc (m + n)

Figure A.1: Definition of addition with overlapping patterns in Agda. This definition is
accepted by the confluence checker.

    overlapping
      _∧_ : Bool → Bool → Bool
      x ∧ false = false
      false ∧ y = false
      x ∧ true = x
      true ∧ y = y

      _∨_ : Bool → Bool → Bool
      x ∨ true = true
      true ∨ y = true
      x ∨ false = x
      false ∨ y = y

Figure A.2: Overlapping block with multiple functions. Although it is not visible in this
example, it is allowed that the functions in the block mutually refer to each other.

    overlapping
      _*_ : ℕ → ℕ → ℕ
      zero * n = zero
      m * zero = zero
      suc m * n = n + (m * n)
      m * suc n = (m * n) + m

    While checking overlap, @0 and @0 + (@1 * @0) were unequal.

Figure A.3: Definition of multiplication with overlapping patterns. This definition is
rejected by the confluence checker. At the moment, something is preventing the right-hand
sides in the error message from being displayed correctly.

Two new error messages have been added to Agda, corresponding to the two errors
described in section 6.2. The first one occurs when the unification of two patterns succeeds
positively, and the right-hand sides were not found to be equal after this unification.
Figure A.3 gives an example where this error occurs.

The second error occurs when the unification of two patterns fails, violating the
assumption of theorem 6.2. An example where this error occurs is given in figure A.4.

    data _≟_ {A : Set} : A → A → Set where
      eq : (x : A) → x ≟ x
      neq : (x y : A) → x ≟ y

    overlapping
      f : (x y : Bool) → x ≟ y → Bool
      f true true p_ = true
      f true .y (neq .true y) = y
      f false y p_ = y

    Couldn't determine whether patterns overlap.

Figure A.4: An example where the unification algorithm fails. This could be improved
by using a better unification algorithm.

A.2 Implementation issues

Here are some remarks on the problems we encountered when working with the source 
code of Agda. 

• Adding a single field to the internal representation of a function in order to dis- 
tinguish between old and new definitions caused quite a bit of trivial but non-local 
changes to the code. This is caused by the fact that the Agda source code often 
refers to the specific representation of a function definition, which could perhaps be 
done better by using getters for each field. 

• The built-in unification algorithm of Agda didn't work for unifying patterns because 
it assumed that the De Bruijn-indices of the variables are disjoint. This is not the 
case for patterns, as each clause has its own scope. Hence we implemented a simple 
two-pass unification algorithm for patterns. In the first pass, the accessible parts of 
the patterns are unified, and in the second pass, pattern variables and inaccessible 
patterns are unified. 

• The current Agda implementation didn't provide any substitution that directly sub- 
stitutes a term for a variable without shifting the De Bruijn-indices. This kind of 
direct substitution was needed for the confluence checker, so we had to add it our- 
selves. The reason that this was needed, was that the variables in two different 
clauses can have the same De Bruijn-indices, without being 'the same' variable. 

Hopefully, these remarks can also be useful for someone implementing a confluence checker 
for another language. 